Identifying the correct geolocation tile data

houchin
Posts: 128
Joined: Mon Jan 10, 2011 6:20 am

Identifying the correct geolocation tile data

Post by houchin »

Hi all,

Is there a somewhat simple way to determine, from metadata I can easily pull out of the Science RDR or Spacecraft Diary RDR, which geolocation tiles (and similar data) are needed to process a given granule through the VIIRS SDR geolocation algorithm?

I asked a few folks this question by e-mail, and unfortunately no one had an answer. So what I am currently doing is passing the entirety of the tile database into ADL by listing those input directories in my inputPath. That adds over 66K additional files as inputs, and I have seen the warning in the ADL documentation that more than 1000 files will cause some slowdowns.

As a quick baseline, I have run just the VIIRS Verified RDR algorithm in isolation. Without those additional input directories, the algorithm completes in about 30 seconds; with them, it takes over 9 minutes. Obviously, I would like to avoid a slowdown of that magnitude.
Scott Houchin, Senior Engineering Specialist, The Aerospace Corporation
15049 Conference Center Dr CH3/310, Chantilly, VA 20151; 571-307-3914; scott.houchin@aero.org
kbisanz
Posts: 280
Joined: Wed Jan 05, 2011 7:02 pm
Location: Omaha NE

Re: Identifying the correct geolocation tile data

Post by kbisanz »

Unfortunately, I am not aware of an easy way to determine what tiles are needed for a given granule based on the RDRs.

If you're only running through Geolocation and not the Gridded IP algorithms later on in the VIIRS SDR controller, you do not need any of the GIP tiles. You should only need Terrain-Eco-ANC-Tile files. I believe there are 1664 of those files. Removing the other GIP tiles (assuming you're not running the GIP code) should speed things up.
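As a rough sanity check (the directory path below is just a placeholder, and this assumes the tile files on disk carry the Terrain-Eco-ANC-Tile names that appear in the metadata), you can count how many geolocation tiles your input directory actually exposes; expect on the order of 1664 data files, plus their .asc metadata files if those sit alongside:

Code: Select all

# Count Terrain-Eco-ANC-Tile entries in a tile directory (placeholder path).
# Expect roughly 1664 data files, plus .asc metadata files if present.
ls /path/to/tile/database | grep -c '^Terrain-Eco-ANC-Tile'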

Once you have a geo product, you can look at the .asc file for an output (such as VIIRS-MOD-GEO). Inside the .asc file is the N_Anc_Filename metadata, which lists the Terrain-Eco-ANC-Tile files used, among other ancillary files. The tile ID is near the end of each file name. For example, the tile IDs are N0367 and N0368 in this snippet:

Code: Select all

    ("N_Anc_Filename" STRING EQ "Terrain-Eco-ANC-Tile_20030125000000Z_ee00000000000000Z_NA_NA_N0367_1.O.0.0")
    ("N_Anc_Filename" STRING EQ "Terrain-Eco-ANC-Tile_20030125000000Z_ee00000000000000Z_NA_NA_N0368_1.O.0.0")
The above info about N_Anc_Filename won't help you the first time, but perhaps it could be used to make rerunning the granule faster.
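For example, something along these lines (the .asc file name is a placeholder for whatever metadata file your VIIRS-MOD-GEO output produces) should pull the unique tile IDs back out of the geo product's metadata:

Code: Select all

# List the unique Terrain-Eco-ANC-Tile IDs recorded in a geo product's .asc file.
# The .asc file name is a placeholder; use the one produced for your granule.
grep -o 'Terrain-Eco-ANC-Tile_[^"]*' VIIRS-MOD-GEO_granule.asc \
    | grep -o 'N[0-9][0-9][0-9][0-9]' \
    | sort -u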

Another thing that might speed things up is to reduce the number of debug messages by running at debug level MED or HIGH. You could also turn debug off altogether. Of course, if you run into problems, you'll want the debug messages turned on.
Kevin Bisanz
Raytheon Company
kbisanz
Posts: 280
Joined: Wed Jan 05, 2011 7:02 pm
Location: Omaha NE

Re: Identifying the correct geolocation tile data

Post by kbisanz »

Determining the tile IDs required for the gridding algorithms is a bit trickier. Again, you need to have run the algorithm once to determine the list. First, locate the VIIRS-Grid-To-Gran-GridIP-AW-SWATH-Mod-IP file for the granule in question. Then execute

Code: Select all

od -A d -v -t u1 -j 196608000 -N 5184 -w1 $PathToFile | grep 1$
The 196608000 is the offset to the tileList field in the VIIRS-Grid-To-Gran-GridIP-AW-SWATH-Mod-IP file. This was determined by viewing $ADL_HOME/xml/VIIRS/VIIRS_GRID_TO_GRAN_GRIDIP_AW_SWATH_MOD_IP.xml in a browser and looking at the table near the bottom.

The above will output something like this:

Code: Select all

196608835   1
196608836   1
196608837   1
196608838   1
196608906   1
196608907   1
196608908   1
196608909   1
196608910   1
196608911   1
196608978   1
196608979   1
196608980   1
196608981   1
196608982   1
196608983   1
196609050   1
196609051   1
196609052   1
196609122   1
196609123   1
There are 5184 possible tile IDs used by the GIP code, and the tileList field is an array of 5184 elements. If an array element is 1, the tile whose ID equals that index is required for this granule; a 0 means it is not required. So, if tileList[836] is set to 1, tile ID 836 is required for whatever tile types are listed in the *_CFG.xml file.

Because the numbers in the first column of the od output are offsets from the beginning of the file, you need to subtract the field offset (196608000) to get the index into the tileList array. For example, 196608836 - 196608000 = 836, which means that tile ID 836 is required. Using

Code: Select all

od -A d -v -t u1 -j 196608000 -N 5184 -w1 $PathToFile | grep 1$ | perl -p -e 's/(\d+)\s.*/$1 - 196608000/e'
should do the subtraction automatically and output something like

Code: Select all

835
836
837
838
906
907
908
909
910
911
978
979
980
981
982
983
1050
1051
1052
1122
1123
The above output would indicate that those ~20 tile IDs were needed for this particular granule. The collection short names needed are whatever is specified in the algorithms' *_CFG.xml files.
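If you want to see which tile collections those are without digging through the configs by hand, a quick grep over the configuration files can help. This is only a sketch: the ${ADL_HOME}/cfg location and *_CFG.xml naming are assumptions, so point it at wherever your algorithm config files actually live.

Code: Select all

# Sketch: list tile-related collection short names mentioned in the algorithm
# configuration files. The cfg directory is an assumption; adjust as needed.
grep -ih 'Tile' ${ADL_HOME}/cfg/*_CFG.xml | sort -u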
Kevin Bisanz
Raytheon Company
ljiang
Posts: 58
Joined: Mon Jul 11, 2011 10:57 pm

Re: Identifying the correct geolocation tile data

Post by ljiang »

For the geolocation tiles, does anyone know where to download all the Terrain-Eco-ANC-Tile files? Thanks!
Lide Jiang
CIRA @ NOAA/NESDIS/STAR
kbisanz
Posts: 280
Joined: Wed Jan 05, 2011 7:02 pm
Location: Omaha NE

Re: Identifying the correct geolocation tile data

Post by kbisanz »

For the tiles required by geolocation, I think you want this: https://jpss.ssec.wisc.edu/adl/download ... aLinks.tgz

For the tiles required by the GIP algorithms, I think you want this: https://jpss.ssec.wisc.edu/adl/download ... aLinks.tgz
Kevin Bisanz
Raytheon Company
kbisanz
Posts: 280
Joined: Wed Jan 05, 2011 7:02 pm
Location: Omaha NE

Re: Identifying the correct geolocation tile data

Post by kbisanz »

Once you download and extract the tiles, multiple input paths can be used by separating them with a colon, just like the $PATH variable on Linux. Environment variables can also be used, but must be surrounded by curly braces. For example: “<inputPath>/home/user/inputs:${ADL_HOME}/data/inputs</inputPath>”.

Note that adding both sets of tiles will greatly increase the number of inputs and will likely make your algorithm run slower. It should still complete successfully though.

Once the algorithm has been executed once, you can use the methods discussed earlier in this thread to determine which tiles are really needed. Then you only need to put those tiles in your input directory.
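As a sketch of that last step (all paths are placeholders, and it assumes the tile files on disk start with the same names that appear in N_Anc_Filename), copying only the tiles a granule actually used might look like this:

Code: Select all

# Copy only the Terrain-Eco-ANC-Tile files a granule actually used into a
# trimmed input directory. TILE_DB and SMALL_INPUT are placeholder paths,
# and the on-disk naming is assumed to match the N_Anc_Filename entries.
TILE_DB=/path/to/full/tile/database
SMALL_INPUT=/home/user/inputs/tiles
mkdir -p "$SMALL_INPUT"
grep -o 'Terrain-Eco-ANC-Tile_[^"]*' VIIRS-MOD-GEO_granule.asc | sort -u |
while read tile; do
    cp "$TILE_DB/$tile"* "$SMALL_INPUT/" || echo "no match for $tile"
done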
Kevin Bisanz
Raytheon Company
ljiang
Posts: 58
Joined: Mon Jul 11, 2011 10:57 pm

Re: Identifying the correct geolocation tile data

Post by ljiang »

kbisanz wrote:Once you download and extract the tiles, multiple input paths can be used by separating them with a colon, just like the $PATH variable on Linux. Environment variables can also be used, but must be surrounded by curly braces. For example: “<inputPath>/home/user/inputs:${ADL_HOME}/data/inputs</inputPath>”.

Note that adding both sets of tiles will greatly increase the number of inputs and will likely make your algorithm run slower. It should still complete successfully though.

Once the algorithm has been executed once, you can use the methods discussed earlier in this thread to determine which tiles are really needed. Then you only need to put those tiles in your input directory.

Thanks for the links and tips.
Lide Jiang
CIRA @ NOAA/NESDIS/STAR