Date: Tue, 20 Jul 2010 09:20:35 -0700
Reply-To: "Terjeson, Mark" <Mterjeson@RUSSELL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Terjeson, Mark" <Mterjeson@RUSSELL.COM>
Subject: Re: Performance and disk write speed
In-Reply-To: <COL118-W63CF6989F0AF474DEAFDE3F1A00@phx.gbl>
Content-Type: text/plain; charset="us-ascii"
Yes, volumes striped across multiple disks are generally faster,
since the reads and writes are spread over several disks at once.
It is also good to have them mirrored, because in a plain striped
volume, if a single disk goes out, the whole volume is lost.
I recall that where I used to work we had some disks added and
performance was really slow until they fixed the striping,
and woohoo!
________________________________
From: Michelle Zunnurain [mailto:michelle_zunnurain@hotmail.com]
Sent: Tuesday, July 20, 2010 9:08 AM
To: Terjeson, Mark; SAS-L LISTSERV.UGA.EDU
Subject: RE: Performance and disk write speed
The code I wish to replace is usually a data step merge or SQL.
The code I am testing is usually a hash join.
I am running all three methods and picking the fastest.
I tested whether the write speed was faster in workspace
than in folder space; there was no difference.
Something I found out from IT is related to "striped" vs
unstriped disks. Do striped disks write faster than unstriped?
________________________________
Subject: RE: Performance and disk write speed
Date: Tue, 20 Jul 2010 08:57:15 -0700
From: Mterjeson@russell.com
To: michelle_zunnurain@hotmail.com
Hi Michelle,
I'm not sure we have asked about the method of the writes,
i.e. is the code being measured a data step, or SQL, or a PROC,
or ??? The way the data is being written could potentially
be part of the problem. You may have already ironed this out,
but I am just checking that we are chasing the right dog.
The most drastic example is SQL writes. If the query writes a
whole table at once, it is going to run quickly. The same goes for
bulk-copy type operations: writing in large chunks is going to run
faster. Some table copy modes use INSERTs, and that results in very,
very slow operation because it writes one row at a time and pays
the overhead for each and every row.
You have probably ruled out anything obvious such as this.
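
To illustrate the row-at-a-time versus bulk-write point, here is a
minimal sketch in Python using the standard sqlite3 module. It is not
SAS and not the code from this thread; the table names, column names,
and function names are all hypothetical, and an in-memory database is
assumed, so the timing gap will be much smaller than on a real disk or
a networked database.

```python
import sqlite3
import time


def load_row_at_a_time(conn, rows):
    """One INSERT and one commit per row: the per-row overhead is paid every time."""
    for row in rows:
        conn.execute("INSERT INTO slow_copy (id, val) VALUES (?, ?)", row)
        conn.commit()


def load_in_bulk(conn, rows):
    """All rows handed over in one batch and committed once."""
    conn.executemany("INSERT INTO bulk_copy (id, val) VALUES (?, ?)", rows)
    conn.commit()


def compare(n_rows=2000):
    """Load the same rows both ways; return row counts and elapsed seconds."""
    conn = sqlite3.connect(":memory:")  # assumption: in-memory, no real disk I/O
    conn.execute("CREATE TABLE slow_copy (id INTEGER, val TEXT)")
    conn.execute("CREATE TABLE bulk_copy (id INTEGER, val TEXT)")
    rows = [(i, "x" * 20) for i in range(n_rows)]

    start = time.perf_counter()
    load_row_at_a_time(conn, rows)
    slow_seconds = time.perf_counter() - start

    start = time.perf_counter()
    load_in_bulk(conn, rows)
    bulk_seconds = time.perf_counter() - start

    slow_count = conn.execute("SELECT COUNT(*) FROM slow_copy").fetchone()[0]
    bulk_count = conn.execute("SELECT COUNT(*) FROM bulk_copy").fetchone()[0]
    conn.close()
    return slow_count, bulk_count, slow_seconds, bulk_seconds


if __name__ == "__main__":
    slow_count, bulk_count, slow_s, bulk_s = compare()
    print(f"row-at-a-time: {slow_count} rows in {slow_s:.4f} s")
    print(f"bulk:          {bulk_count} rows in {bulk_s:.4f} s")
```

Both loads end up with the same rows; only the number of separate
write operations differs, which is the overhead described above.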
Mark
________________________________
From: Michelle Zunnurain [mailto:michelle_zunnurain@hotmail.com]
Sent: Wednesday, July 14, 2010 8:42 PM
To: Terjeson, Mark; SAS-L LISTSERV.UGA.EDU
Subject: RE: Performance and disk write speed
Thanks Mark, for your initial thoughts.
Any suggestions on specific questions to ask the IT people?
I did a rough estimate tonight: the speed my file was
writing at was about 33 MB per second.
I calculated that by taking a timestamp and
the file size, then waiting 4 minutes, then doing
the same thing again. I subtracted the first size
from the second and divided by the elapsed seconds.
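
The calculation above can be sketched in Python as follows. This is
only a minimal sketch: the function names and any file path passed in
are hypothetical, and MB is assumed to mean 10^6 bytes.

```python
import os
import time


def throughput_mb_per_s(size_start, size_end, elapsed_seconds):
    """Average write rate in MB/s between two file-size samples given in bytes."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (size_end - size_start) / elapsed_seconds / 1_000_000


def sample_write_rate(path, wait_seconds):
    """Read the size of a growing file twice, wait_seconds apart, and return MB/s."""
    size_start = os.path.getsize(path)
    t_start = time.monotonic()
    time.sleep(wait_seconds)
    size_end = os.path.getsize(path)
    elapsed = time.monotonic() - t_start
    return throughput_mb_per_s(size_start, size_end, elapsed)
```

For instance, throughput_mb_per_s(0, 7_920_000_000, 240) gives 33.0,
i.e. a file that grew by about 7.92 GB over four minutes. Note that
this measures how fast the job is producing output, which may be
lower than what the disk itself can sustain.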
I have tried a lot of the techniques you mentioned,
plus hash joins, combining data steps, etc.
There is a lot more I can do, but it takes a lot of
time to do parallel testing to make sure the new
results match the old results, and a lot of tracing
through the process to see what comes from where
and what it is used for down the road.
Thanks for your input.
> Subject: RE: Performance and disk write speed
> Date: Wed, 14 Jul 2010 16:02:38 -0700
> From: Mterjeson@russell.com
> To: michelle_zunnurain@HOTMAIL.COM; SAS-L@LISTSERV.UGA.EDU
>
> Hi Michelle,
>
> Just some initial thoughts. Typically Unix is
> much faster than Windows and Windows networks,
> which is why many mid- and large-size organizations
> have a Unix box. However,
> we need to start clarifying all of your specifics
> for the application or job you are running. If
> you are processing on Unix and writing to the
> same Unix box and the write is slow, then merely
> reducing the volume of bytes should help, i.e. reduce
> the row count if possible, replace lengthy strings with
> normalized codes and code lookup procedures, and reduce
> the widths of values and their variables where possible.
> In short, just reduce the bulk. On the other hand, IF you
> are processing on the Unix box and writing to
> another box or network, then that is your biggest
> culprit (i/o). Whatever bytes you send over the
> "wire" from one box to the other box is the slowest
> link in the chain. Again, reducing bulk is one
> factor, and possible hardware investigations or
> changes may enhance the throughput. Check with
> your IT people for possibilities if that is where
> you isolate much of the slowdown to.
> Just some initial thoughts.
>
>
>
> Hope this is helpful.
>
>
> Mark Terjeson
> Investment Business Intelligence
> Investment Management & Research
> Russell Investments
> 253-439-2367
>
>
> Russell
> Global Leaders in Multi-Manager Investing
>
> -----Original Message-----
> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
> Michelle Zunnurain
> Sent: Wednesday, July 14, 2010 3:14 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Performance and disk write speed
>
> Hello List,
>
> After going through a lot of SAS logs, and attempting
> to make performance improvements using keep/drop etc.,
> perusing the logs using %logparse to check real time,
> cpu time, obs read, obs written, it looks like the run time
> boils down to disk write speed.
>
> Besides timing an actual job, is there any way to check
> "write speed" and more importantly improve write speed?
>
> We are operating in a Unix environment (HP-UX).
>
> Just curious.
>
> Michelle Z
>
> P. S. Please don't suggest a mainframe.