Group 4: Bruce Bayne, April Brice, Andrew Pasquale, and Mark Wheeler
As part of this multifaceted assignment we converted point-based precipitation data of Southern California into rasterized grids of average annual precipitation. This grid was used along with the drainage network, flow direction, and flow accumulation grids created from the previous labs TOPOGRID output to compute mean annual discharges and the drainage network density. These data were analyzed to produce statistics that were then used to speculate about some of the erosional processes and discharge intensities in the subject area.Data Source:
Precipitation and Station ID Source
Precipitation and Station Identification data were obtained for us,
but are available from the NCDC homepage
as directed in the
assignment. To save server space we created a working link to
these data using a symbolic link command:
The DEM used in this assignment was created in the previous lab using
TOPOGRID. The basic processes and outlines for using the DEM in this
assignment are laid out nicely in the Arc/Info help under Flow - Deriving
surface runoff characteristics.
This process is depicted in the below chart:

Rain gauge station
dataset coop.txt.
This data set included all rain gauge station for
California so we began by selecting rain gauge stations that were within
one degree of our study area. Keeping rain gauge data just outside
of our study area will allow for greater precision when it comes to averaging
the precipitation data because A/I will be interpolating rather then extrapolating
for most areas. This was accomplished using MS Excel
by sorting for LONGITUDE -118 to -120 degrees and LATITUDE 34 to 35 degrees.
All records outside of these values were discarded.
There were numerous replicate station identifications
that needed to be parsed out. We first approached this by looking
for the stations that had been active for the longest period of time, but
soon realized that a large amount of time at one location, might not equal
the numbers of years spent accumulatively at other locations and didn't
necessarily mean a better gauge location. Since the difference in
positions of these replicate stations was relatively small (50m) we opted
to simplify this process by selecting Station Ids and lat/long locations
based on the first record for that station id in the dataset. These
tended to be the present location.
We attempted to do this station parsing using an awk command:
- awk '$5 ~ /CA/ ' coop.txt
> stations.txt ....and several other variations of this.
The awk didn't work and so we used MS Excel:
- Calculated x,y values
using formula x = degrees + (minutes/60) + (seconds/360), y = (degrees
- (minutes/60) + (seconds/360)
- Extracted years from begin and
end fields using the formula MID({cell},1,4).
- Saved as comma-delimited text
file - for use with the GENERATE command in ARC/INFO)
Precipitation dataset
caprecipt.txt
Unique Station
ID and Averaging Dataset:
The precipitation data contained records with erroneous values of 99999
because of incomplete yearly data and what-not. So the first process
was to purge the dataset of these records. We used the following
awk
command for this:
- awk '{if ($NF != 99999)
print $1, $NF}' caprecip.txt > caprecip2.txt
Then we needed to average the data and remove duplicate station identifications.
Again awk scripts were created for this purpose. First to sort through
precipitation file and average annual precip by unique station id:
- BEGIN { n_years = 0; sum_pcp
= 0.0; stat_id = -1;}{ if ($1 == stat_id) {n_years = n_years + 1; sum_pcp
= sum_pcp + $2;}
else {if
(n_years > 0) {printf("%s %10.3f\n", stat_id, sum_pcp/n_years);}sum_pcp
= $2; n_years = 1; stat_id = $1; }}END
{print stat_id,
sum_pcp/n_years; print "END";}
Then to remove duplicate stations and convert to decimal degrees.
- BEGIN { stat_id = -1;}{
if ($1 != stat_id) { printf("%s %f\n", $1, ($NF-3 - (($NF-2)/60) - (($NF-1)/360)),
($NF-6 + (($NF-5)/60) + (($NF-4)/360)));
stat_id
= $1;}END { print "END";}
Points Coverage:
This text file was then used to create a points coverage using the
GENERATE command in A/I:
arc: generate
generate: precip
generate: input stations.txt
generate: points
generate: quit
arc: build precip points
This reprojected point coverage is depicted below laid over the SINK FILLED Topogrid:
Rain Gauge Stations with Sink
Filled Topogrid
Joining Station Data
With Precipitation:
The precipitation data required an info table in order to join to the
precip.pat table that was created during the build for points process above.
This was accomplished using TABLES in A/I:
In TABLES create a new table
to import all the values of the precipitation text file. We used
DEFINE command to create pcptbl to contain to items
<precip-id> with the
same parameters as precip.pat precip-id 4,5,B and the mean annual precipitation
item as <avg_ann_pcp> 8,12,F,3.
Then used the ADD command
to add all files from the precipitation text file(with no headings just
numeric values).
Next pcptbl was joined to
precip.pat using JOINITEM command at arc:
- JOINITEM precip.pat pcptbl
precip.pat precip-id precip-id
Reprojection of
Joined Station/Preciptation Data:
Then the joined file needed to be reprojected from lat/long to the
grid coordinate system so that it could be overlain on top of the
filled topogrid <filled_ca>.
During this process we accidentally built the precipitation coverage
as polygon coverage. To reclaim the points we used ARCEDIT to select
(SEL) all points and then used the PUT command to create a new coverage
(pcp).
Then in GRID completed the following PROJECT commands:
- Grid: project cover pcp
pcp_utm
- input: projection
geographic, units dd, nad27, parameters
- output: projection
utm, units meters spheroid clarke1866, zone 11, datum nad27, parameters
Checked the projection file and built for points as <pcp_utm>
Renamed pcp_utm as pcp (point coverage)
Gridding Mean Annual Precipitation
We experimented with several methods of gridding the mean annual precipitation data for use with weighting the forthcoming FLOWACCUMULATION grid. Specifically, we tried Inverse Distance Weighting, Spline, Pointinterpret, and Resample with Pointgrid. We had the greatest success with Spline, but some of the results of the other methods are worth reviewing.
Inverse Distance Weighting
(IDW):
Our first attempt was by using IDW with the following inputs:
- Grid: pcpgrid = IDW (pcp_utm,
avg_ann_pcp)
- We used a cell size of
50
This lovely little grid took about 24 hours to run. The results
are presented below displayed with as a gridpaint_linear rainbow1.
Inverse Distance Weighting (IDW)
Grid of Mean Annual Precipitation
We tried several variations using IDW but none to our satisfaction so
we moved on to other methods. The final IDW that was tried included
commands to use a search radius of 5000 meters, a minimum of 3 data points
for interpolation, and a resolution of 100:
- Grid: pcpgrid2 = idw (pcp2, avg_ann_pcp, #, 3,
radius, 5000, 3, 100)
Unfortunately the processing time on this run exceeded 48 hours and
was finally terminated.
POINTINTERP
POINTINTERP provides the option of including IDW in the processing
and looked worthwhile.
- Grid: pcpgrid = pointinterp
( pcp2, avg_ann_pcp, 100, IDW, #, #, circle)
The results showed that the circle option disproportionately weighted
the values of points. Points highly clustered had a more narrowly
defined range of precip values, but also rigidly defined the values based
on the radius.
POINTINTERP also allows for exponential weighting and so we tried that
as follows:
- Grid: pcpgrid2 = pointinterp(pcp_utm,
avg_ann_pcp, 50, exp) all other parameters at default.
These results were also unsatisfactory.
POINTGRID and RESAMPLE
POINTGRID and then RESAMPLE were also tried, but this didn’t work since
it didn’t know what to do with the NODATA cells which were everywhere except
where the points were. Oh well.... Bringing us to..
SPLINE
A grid created with Spline is interpolated using a function that fits
a smooth surface through the points with minimum curvature. The REGULARIZE
option makes a smoother surface than the TENSION option. The first
grid was specified so that the formula should use the 8 closest points
and a cell size of 200 meters.:
- splinegrid
= spline( pcp2, avg_ann_pcp, regularized, #, 8, 200)
The results of this grid showed that SPLINE interpolated minimum and
max values that were too far out of our data range to be acceptable.
Our data ranges from about 300 to 4306 in 100th of inches and the spline
grid ranged from –61946 to 118058. The impact of this error is that
the variation that does exist for our data is damped by these extremes
and visual results in the following:
SPLINE Grid of Mean Annual Precipitation
Using REGULARIZE Option
So we ran another SPLINE grid using the TENSION option trying to get
an output that more realistically conforms to the actual data.:
- Grid: splinegrid2 = spline
(pcp2, avg_ann_pcp, tension, 5, 8, 200)
This grid looks fairly good. Minimum values are still too low
(-1384) but the max is good (4355).
The following image shows the extreme negative points in black:
SPLINE Grid of Mean Annual Precipitation
Using TENSION Option
This seemed to be about as good as it gets but we needed to do something
with the negative precipitation values. Essentially we needed to
determine a reasonable value below our minimum precipitation to assign
to the extreme negative values. We reasoned that the 4355 max value
is 49 points greater than the data set values and following this limit
we set everything below a value of 251(300-49) equal to251. This
was accomplished by first attempting to build a VAT file that could be
edited, but received the following error:
- Grid: buildvat splinegrid2
VatBuild: Can't
create VAT for float-type cell layer
ERROR in building
VAT
Then tried a select command using the CON – which works like and IF,Then
statement if value < 251, then 251 else.:
- Grid: splinegrid3 = con(splinegrid2
< 251, 251, splinegrid2)
This output looks good and by comparing this to the TOPOGRID shows
that there may be some discrepancies along the coast but overall the output
seems reasonable.
'Corrected' SPLINE Grid of Mean
Annual Precipitation
Using TENSION Option
'Corrected' SPLINE Grid of Mean
Annual Precipitation
Shown with FILLED Topogrid
Working with the Topogrid (FLOWDIRECTION, SINK, FILL, FLOWACCUMULATION, BASIN)
FLOWDIRECTION
Creating the flow direction grid was simply a matter of executing the
FLOWDIRECTION command from Grid using the final topogrid coverage completed
during the last lab:
- Grid: flow_dir = flowdirection
(oceanfix)
Flow direction looks very similar to the aspect grids that were created
in the previous lab.
The below image shows flow direction using GRIDSHADE and is followed
by an image of the histogram for the flowdirection file. Note the
extreme weighting of the direction towards 0 or north.
Flow Direction Grid
Flow Direction Histogram
SINK
The next step is to define the sinks and identifying sink depth to
specify z-value for the FILL command. This was accomplished using
the WATERSHED, ZONALFILL, and ZONALMIN commands as follows:
- Grid: sink_areas = watershed
(flow_dir, sink (flow_dir))
- Grid: sink_depth = zonalfill
(sink_areas, oceanfix) - zonalmin (sink_areas, oceanfix)
- Grid: Describe sink_depth
reveals
a maximum Z value of 113
FILL
Using the FILL command the sinks up to a depth of 113 meters were filled:
- Grid: Fill ocean_fix filled_ca
sink 113 flow_dir
Filled 13507
sinks
1358
48
2
2
Later this process was repeated to speed up the processing of the flowaccumulation.
At that point we resampled the elevation grid to a cell size of 200 meters.
- GRID: Filled_ca2 = resample
(filled_ca, 200)
And then filled the sinks as above.
FLOWACCUMULATION
The flow accumulation grid was created by inputing the flow direction
grid and weighting this with the spline grid containing the mean annual
precipitation:
- Grid: flowacc = flowaccumulation(
flowdirgrid, splinegrid3)
The output grid for the flow accumulation is depicted below
Flow Accumulation Grid
BASIN
Though not part of the assignment, we wanted to see what the BASIN
tool might produce. This is a simple tool that only requires input
of the flow direction grid and computes watersheds or basins based on breaks
in those directions, i.e., ridgelines. The output grid included 699
basins within our study area. As can be seen in the below image the
majority of those basins are clumped along the shoreline. We didn't
spend too much time analyzing this grid, but given abrupt topography along
the coast it is probable that there would be numerous smaller basins in
that area. There may be some problems with areas that border on the
edge of the project area. Some of these areas may combine into larger
basins if a large extent was considered. The below image depicts
these basins and is overlayed with the stream network created in the below
steps:
Basins
We evaluated a number of accumulation thresholds by comparing the stream
networks generated with this command to the USGS hydrography layer for
this area. We found that an accumulation of 7500 hundredths of an
inch matched the hydrography fairly well. The command did not deal
well with flat areas, especially the plateau to the northeast and the delta
near Los Angeles. The command line for creating the stream network is:
- GRID: stream7a = (flowacc,
7500)
Stream Network using a Threshold
of 7500
In RASTER format creating and displaying
the stream orders of the above coverage is a piece of cake. Simply
execute the following:
- GRID: strmorder = streamorder(
streamnet, flowdirgrid)
and display using:
- Gridpaint_linear strmorder
rainbow1 1 7
to show each order (1-7) having a different color
Raster Stream Order
Thats great but we need this in a VECTOR
coverage so:
Convert to Vector converage using:
- GRID: stream_cov = streamline( strmorder, flowdirgrid, order)
This created a vector cover with each arc having an item called order
associated with it containing the value of stream order.
The order command is not applicable for vector coverage, so in order
to present the data in a meaningful fashion resulted in the creation of
a lookuptable (lutb) were the colors of each stream order were specified.
- Tables: sel lutb
7 Records
Selected.
list
Record ORDER SYMBOL
1 1
1
2 2
28
3 3
43
4 4
61
5 5
89
6 6
110
7 7
126
This looked like shit ....so we tried using the options available in
SYMBOLDUMP. This provided a series of option with lines that got
wider and brighter as the values grew. This looked better but the color
for one (gray) was too close to fives (white) and so we changed the lutb
(look up table) to start with red 2.
Enter
Command: list
Record ORDER SYMBOL
1 1
2
2 2
3
3 3
4
4 4
5
5 5
6
6 6
7
7 7
8
again this was better but not perfect........
The commands for displaying this below image are:
- Arcplot: arcl stream_cov
order lutb
Vector Stream
Order
Another attempt was tried using the ARCSHADES.aml in atool to display
stream coverage by order:
Arcplot: Arcshades stream_cov
order 1 7
...but it was hard to distinguish between stream orders because the
colors were so similar. So a series of commands were used to select
each order of streams individually, then change the line color and line
size to reflect differences in order.
- ARCPLOT: RESEL stream_cov
line order = 1
- ARCPLOT: LINESIZE
0.010
- ARCPLOT: LINECOLOR
blue
- ARCPLOT: ARCS stream_cov
- ARCPLOT: NSEL stream_cov
line
- ARCPLOT: RESEL stream_cov
line order = 2
- ARCPLOT: LINECOLOR
violet
- ARCPLOT: ARCS stream_cov
These steps were repeated with all seven orders, using the colors blue
(1), violet (2), red (3), orange (4), yellow (5), chartreuse (6), and forestgreen
(7). Additionally the linesize was increased incrementally beginning
at order 4 using the LINESIZE command. Order 7 has a LINESIZE of
0.021.
This final version of VECTOR stream order is presented below:
|
blue (1), violet (2) red (3), orange (4), yellow (5), chartreuse (6), and forestgreen (7) |
![]() |
VECTOR Stream Order
Vector
Drainage Network as Mean Annual Discharge, Drainage Density
UNIT CONVERSION
Up to this point the units of rainfall that we
have been dealing with are in 100th of inches, which is a linear measurement.
To depict this information in a manner useful for showing "vector drainage
network, symbolized as a function of mean annual discharge" required converting
these units to a volume. We converted the mean annual precipitation
to cubic meters:
-
Hundredths of inch to inch divide by 100
-
Inches to centimeters multiply by 2.54
-
Centimeters to meters divide by 100
-
Meters to cubic meters multiply by 4,000 (200x200 cell size)
=
1.016 cubic meters * 100th of inches per 200x200 meter cell
Mean
Annual Discharge
We found that the original flow accumulation
grid had a majority of values as 0. Confirmed by using ArcView.
So re-ran fowaccumulation using the converted precipitation.
-
GRID: flowacc_m = flowaccumulation (flowgrid, precip_meters)
-
voldis = con (stream7a == 1, flowacc_m)
-
<voldis is volume discharge>
The spread of values was too large and difficult
to deal with so we altered the spread by using A/I to create a grid of
the square root fo the volume accumulation.
-
sqrtgrid = sqrt(voldis)
The spread of values ranged from 87.29 to 12,438.
The natural breaks seem to occur around each 1,000 unit change in values.
Using ArcView we confirmed the spread of values and constructed and a map
in A/I using the following commands to assign volume accumulation specific
display values.
Used the CON to select values based on the natural
breaks described above, we created a set of grids – each containing an
order of flow accumulation values 1 – 7. So, gird dis1 contains
flow accumulation cells of an order of one – all cell values are below
1,000 cubic meters.
From Grid:
dis1 = con(sqrtgrid < 1000, 1)
dis2 = con(sqrtgird <2000, con(sqrtgrid >1000),1)
dis3 = con(sqrtgrid <3000, con(sqrtgrid >2000),1)
dis 4 = con(sqrtgrid <4000, con(sqrtgrid >3000),1)
dis5 = con(sqrtgrid <5000, con(sqrtgrid >4000),1)
dis6 = con(sqrtgrid <6000, con(sqrtgrid >5000),1)
dis7 = con(sqrtgrid >6000,1)
Then we converted each of these dis# grids to vector coverages
grid: disline1 = gridline(dis1)
grid: disline2 = gridline(dis2)
...continued this process for all 7 dis# grids
In arcplot we used line color commands to display
the volume accumulation by the ordered values of 1 – 7.
From ARCPLOT:
arcs disline1
lincolor blue <cubic meters less than
1,000 squared >
arcs disline2
linecolor chartreuse <cubic meters between
1,001 and 2,000 squared)>
The below graphic depicts the Vector drainage
network as represented by mean annual discharge


Vector Drainage Network Symbolized as
Mean Annual Discharge
(Units are square root of cubic meters per year)
DETERMINING THE ERROSION IMPACTS
Deriving Slope
We grappled with a way to depict the erosion and how to assign this value. We conceptualized were we considered the most erosion would occur. We first constructed the slope.GRID: slopegrid = slope(filledca3)
![]()
Slope GridErosion
We then created a grid of erosion based on selecting flow accumulation values (1) to equal the slope multiplied by our grid of the square root flow accumulation values. This seemed yield results close to our assumption that the greatest erosion would occur along the step slopes, but we found that it was weighted heavily along the areas of highest flow accumulation.
- GRID: erosion = con(stream7a == 1, slopegrid * sqrtgrid)We determined that an equations was need to emphasize the effect of the slope and demphasize the effect of the discharge. Tried this equation
in GRID:
- erosion = con(stream7a ==1, (sqr(slopegrid)) * sqrtgrid)
The resulting grid meet our expectation.
![]()
Erosion GridDRAINAGE DENSITY
Drainage density is still being worked on but this is the start of the process....The below method was tried but did not work, and so we moved on...
Used DOCELL command
created variable called COUNT
The variable was used to derive the total number of cells for total cellsIN GRID:
count = scalar(0)
totalcells = scalar(0)GRID: docell
:: totalcells += 1
::if (stream7a > 0) {
::outgrid2 =1
::count +=1
::endNext we used the LINEDENSITY COMMAND:
This resulted in the following image:
![]()
Drainage Density overlayed with stream coverage
(White areas are high density, dark areas are low density)We also determined drainage density by basin. We summarized stream length by basin id, calculated density by divided total stream length per basin by basin area. This process was performed in ARC/VIEW using the summarize and jointable functions. Basins along the shore have the highest density, which underscores the fact that these are small linear watersheds. Interior basins all had similar low drainage densities. We hypothesize that erosion impacts will be greatest in those areas where drainage density is the highest, since stream length per unity area is greatest in these areas. See the pictures below for a summary.
![]()
Drainage Density by Basin
(see legend below)
![]()
Drainage Density by Basin (Closeup View)
Please note that due to space considerations, not all drainage density values are not depicted in the legend.To Determine Total Drainage Density, we calculated the Total area by multiplying the number of rows by the number of columns by the area of each cell. We could have discounted the ocean area in this count, but we decided to just assume that the ocean took 1/4 of the area.
Total area: 158,492 hectares
Total stream length: 30,016,585 metersDrainage Density = Stream length / Total area = 189 meters /hectare