100K DLG SDTS extract program

1.0  Fix error if no "Readme" file is present in directory. 6/21/2000
1.2  Add error message if DosStartSession fails to start. 11/9/2000
1.4  Fixed error in starting Show173 (not patching state designation
     in); added error message into file when not correctly started;
     fixed the quote in a string error in places routine. 11/22/2000
1.5  Added logic to catch ARDF references that are larger than the
     number of records in the ARDF file. Also, raised the axxf limit
     to 135,000 to handle wlongisw. Added version number to log file.
     12/07/2000
1.6  Added Hawaii (HI) to the states that can be extracted. Fixed
     missing Highway Theme so it writes to log and continues instead
     of erroring out. 1/26/2001
1.7  Changed the new record start to CRLF from XX. 2/17/2001
1.8  Added command line parameter (ST) for unattended operation and
     fixed problem that prevented more than 1 _sdts_ file for same
     theme in subdirectory. 2/24/2001
1.9  Added code to log error message if HY01AHDR.DDF file cannot be
     found. 3/19/2001

Theory of Operation:

EXTRACT is an OS/2 PM program that accepts as input GZIP'd and TAR'd
1:100,000 USGS DLG data files in SDTS format, extracts the needed data
and builds DEX (DLG Extension) files as output. EXTRACT also outputs
MAP header files and a log file. It is currently capable of handling
four 100K map themes: Boundary, Transportation, Hypsography and
Hydrography. The output files (other than the log file) are in the
correct format for direct use by MOVEMAP.

The program can be executed from either an OS/2 session or a desktop
icon. EXTRACT has a large memory footprint and uses a great many
processor cycles. On a 300 MHz PII it has run for as long as 4 hours
doing the extract and build on a single Hypsography file set.

Once pointed to a 100K state subdirectory, EXTRACT will search all
directories in the state for the appropriate USGS .GZ files.
It will unzip the file (using the USGS provided DOS GZIP), untar the
resulting file (using the USGS provided DOS TAR), build the .DEX files
and then clean the directory of the untar'd files. EXTRACT will finish
when it has processed all appropriate files in all the state
subdirectories. It will leave the unzip'd files in the directory in
case a rerun is required. In order to rerun EXTRACT against any
particular SDTS file, the unzip'd file must be rezip'd using the DOS
GZIP utility.

EXTRACT operates on the SDTS .DDF files provided by first building
large memory tables from all referenced files. Once the reference
tables are built, EXTRACT will process the large line .DDF files one
record at a time, producing for each a single output line record in
the .DEX output file that contains all the required reference
information in a format usable by MOVEMAP. After processing the last
line record, EXTRACT will write a header record in the appropriate
MAP*.DEX file. The header record contains information that MOVEMAP
needs to locate and use the map section.

Directory/File Naming Structure:

EXTRACT expects to find a very specific directory structure. Operation
is unpredictable (but incorrect) if the required directory structure
is unavailable. The required input structure is:

  a:\100K\SS\AAAAAAAD\BBBBTTNN.GZ

Where:
  a        is any hard drive.
  SS       is a two letter state designation; e.g. CA for California,
           NC for North Carolina, etc.
  AAAAAAA  is a seven character map section subdirectory name. The
           USGS data sets cover 1 degree of latitude by 1/2 degree of
           longitude and there are EAST and WEST sections of each. I
           give the EAST and WEST sections the same seven character
           name.
  D        must be either E for the EAST map section or W for the WEST
           map section.
  BBBB     is a four character map section identifier. I used to use
           the old USGS map section designation (SD1 for a section
           near San Diego, etc.). It's up to you what to use here.
  TT       must be a two character theme designator.
           BD for boundary, HY for hydrography, HP for hypsography and
           TR for transportation.
  NN       is a two digit field that can be set to any number you
           want.

The output structure produced by EXTRACT is:

  a:\100K\SS\AAAAAAAD\BBBLLDKK.DEX

Where:
  a        is the same hard drive.
  SS       is the same state subdirectory as the input.
  AAAAAAAD is the same map section subdirectory as the input.
  BBB      are the first three characters of the USGS input file.
  LL       is a two character extracted theme designation. The themes
           are expanded from the original four input themes. They are
           HY, WA, RR, RD, MT, HP, BD.
  D        is either E for the east section or W for the west section.
  KK       is a two digit map subsection identifier.

The map header file directory structure is:

  a:\100K\MAPLLXXX.DEX

Where:
  a        is the same hard drive.
  LL       is the same two character extracted theme designation as
           above.

EXTRACT also produces a set of backup files: a:\100K\MAPLLXXX.BAC.
Each is a copy of the associated a:\100K\MAPLLXXX.DEX file made just
before an input file with the corresponding theme association is
processed. The input to output theme association is:

  HY -> HY, WA
  HP -> HP
  TR -> RR, RD, MT
  BD -> BD

The Program Bundle:

You will find the following files in the EXTRACT bundle:

  README.TXT   This file.
  EXTRACT.EXE  The executable.
  BPMCC.DLL    EXTRACT uses Borland controls (just like MOVEMAP) so it
               needs to find this file in the \OS2\DLL directory or it
               won't run.
  TAR.EXE      DOS TAR executable available on the USGS web site (see
               below).
  GZIP.EXE     DOS GZIP executable available on the USGS web site.
  SHOW173      OS/2 version of USGS software to write a text output
               file from a DDF input file.

Operation:

1. Unzip the program bundle on the same hard drive on which you will
be building your 100K directory. EXTRACT will only look on the hard
drive that it's on for the input map files. It will also look for the
TAR and GZIP executables in the \BUILD subdirectory on this same
drive. Also, move BPMCC.DLL to the \OS2\DLL directory.

2. Build a map directory on this hard drive as described above
(\100K\SS\...), making EAST and WEST subdirectories for each set of
map section themes you expect to download from the USGS web site. At
this point there should be no files in any directory; just a nice
directory tree.

3. Point your browser to the USGS Map site (http://www-nmd.usgs.gov/)
and look for "Downloadable data" and under that "US GeoData." Click on
it. On the next page click on the 1:100,000 header at the top of the
page. Then click on FTP via Graphics. When this page loads you should
be looking at a graphical representation of the lower 48. Click on the
general location you've set up your directory tree to handle. You
should now be looking at an enlarged representation of the same
section with a matrix of lines and map section names. Click anywhere
in one of the rectangular boxes and you should be given a selection of
themes. Click on one. You will now be given a choice of East or West
map section. Pick one and you will finally have the USGS map data
directory. The directory has UNIX names. The file you're interested in
is the one (usually only one) that has sdts somewhere in its name.
Click on this one. If there is more than one file with sdts in its
name, download them all but make sure they have unique names. When
your browser asks you what to name the file and where to put it,
follow the directions above for the directory/file naming required;
e.g. the downloaded file for a boundary theme for the western section
for a map in NW California might go in the e:\100K\CA\CDERVILW
directory and be given the name CA10BD01.GZ.

4. After you've downloaded all the USGS files you care to in one
sitting, execute EXTRACT from the \BUILD directory. Click on
File->New. When the dialogue opens select the appropriate State and
click on DONE.

5. Sit back and wait. It could take from minutes to hours to days
depending on how many files you've downloaded and how fast a processor
you have.
You can watch Pulse and see what's going on. You should see DOS
windowed sessions open, run and close in the background. You can also
watch the XTRCTLOG.TXT file in the \BUILD directory grow in size by
doing a dir.

The program is loaded with error messages in case something goes
wrong. Most of the errors are non-recoverable: EXTRACT will stop
running right where it is. If this happens, copy down the error
message exactly, give your \100K\MAPLLXXX.BAC files a new file
extension name and save your XTRCTLOG.TXT file. If you email me this
stuff I'll try to figure out what's wrong and fix it.

I wrote most of this program long ago when I was just learning C. I
set up all the data structures at the beginning of the program and as
they're loaded I check the available space in each structure. It is
possible to run out of space. If this happens the error message will
tell me exactly what to increase and I can send a new version. It is
most likely to happen in large metropolitan areas where the map data
gets very dense. I've extracted for St. Louis and Washington DC, but
other large cities may bust these limits.

You'll need plenty of hard drive space to run and store the results.
Just the unzip'ing and untar'ing can more than triple the space
required by the original download, so don't build your \100K directory
tree on a tight hard drive or you'll be sorry.

When the Pulse signature drops and stays down for a bit, the extract
and build is done. To confirm, you can do a
"type \build\xtrctlog.txt | more" and look at what's in the log file.
Each message is time stamped. There should be start and end messages,
and each time a header record is written a message is added. There are
also a small number of recoverable errors that cause entries in the
log file.

6. If you've done it right you can now crank up MOVEMAP with a GPS
reading in the area covered by your map data and you should get a 100K
map drawn.
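Because a run can take hours and EXTRACT depends on the input naming
convention given under Directory/File Naming Structure, it may be worth
checking downloaded file names before starting. The following is a
minimal Python sketch of such a check; it is not part of the EXTRACT
bundle and says nothing about how EXTRACT itself parses names. The
function name check_input_path is hypothetical, and the restriction of
each name field to letters and digits is an assumption (this README
only fixes the field lengths). The theme table is copied from the
input to output theme association above.

```python
import re

# Input-to-output theme association, copied from this README.
THEME_MAP = {
    "HY": ["HY", "WA"],
    "HP": ["HP"],
    "TR": ["RR", "RD", "MT"],
    "BD": ["BD"],
}

# Pattern for a:\100K\SS\AAAAAAAD\BBBBTTNN.GZ
# Assumption: name fields hold only letters and digits; the README
# specifies their lengths but not their character set.
INPUT_RE = re.compile(
    r"^(?P<drive>[A-Z]):\\100K\\(?P<state>[A-Z]{2})\\"
    r"(?P<section>[A-Z0-9]{7})(?P<half>[EW])\\"
    r"(?P<ident>[A-Z0-9]{4})(?P<theme>BD|HY|HP|TR)(?P<seq>[0-9]{2})\.GZ$",
    re.IGNORECASE,
)


def check_input_path(path):
    """Return the parsed, upper-cased fields, or None if the path does not fit."""
    m = INPUT_RE.match(path)
    if m is None:
        return None
    fields = {name: value.upper() for name, value in m.groupdict().items()}
    # Themes of the .DEX files expected for this input theme.
    fields["output_themes"] = THEME_MAP[fields["theme"]]
    return fields
```

For the example name from step 3,
check_input_path(r"e:\100K\CA\CDERVILW\CA10BD01.GZ") returns a
dictionary with state CA, section CDERVIL, half W, theme BD and output
themes ['BD']; a name that does not fit the pattern returns None.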
The only thing missing from the map will be the name designations.
This data is also available from the USGS on a CD-ROM. It must also be
run through EXTRACT to build the appropriate file.

Some years ago I bought the Digital Gazetteer from the USGS. It
contains a Geographical Names DB. It comes with software that allows
you to search for all Names (with lat/lon designation) by rectangular
area. I do this for each east and west map section. The software also
allows for exporting the results in, among other formats, a comma
delimited format. The comma delimited output is run through EXTRACT to
produce an appropriately formatted .DEX file, which is stored in the
same subdirectory as the other .DEX files for the same map section.
The naming convention for the comma delimited input file is
BBBMSXXX.DDX, where BBB is the same three character identifier as the
other input maps in the associated map section.

EXTRACT also uses a slightly modified version of the USGS provided
SHOW173 software to build a text version of the HY01AHDR.DDF file in
each subdirectory. The lat and lon of the corners of the map sections
are in this file, so use your favorite text editor to find them and
use them in building your comma delimited files.

Steps:

A. Execute GNIS.BAT from a DOS command line. I've never been able to
get this to run from a DOS box in OS/2. I've always had to use a real
DOS session in Windows. I believe it has to do with the CD-ROM
drivers. When the first screen comes up select "National Geographic
Names Data Base" from the menu.

B. You should now be looking at a DB search screen. Down arrow to the
"Geographic Coordinates" line and enter the appropriate rectangular
search coordinates; e.g. for the East Long Island east rectangle it's
"403*..4059* and (40*N0720*,40*N0721*, 40*N0722*)". Press enter.

C. When the search completes go to the Display screen from the top
menu.

D. Select Actions (F4) from the menu and then "Export".

E. You should now be looking at a drop down menu for the display
option. Enter the following:

  Export Type       - Comma Delimited
  Filename          - h:\100k\ny\elongise\142sxxx.ddx (for the
                      elongise example only)
  Fields to Include - Feature Name, Feature Type, Geographic
                      Coordinates, Decimal Latitude (Source), Decimal
                      Longitude (Source). All other fields should be
                      left unchecked.
  What Range        - All found documents.
  Start Export      - Yes.

F. Hit enter. When complete you should have the appropriate *.ddx file
in the designated subdirectory.

G. Run EXTRACT, pointing it toward the state directory with the *.ddx
files you want extracted to *.dex files.

10/5/2006 Tom Danninger, porteralexander@nc.rr.com