03/25/2010

Large Data Cubes

Most data cubes are now large, and are read into Lispix as f-cubes: file-based data rather than memory-based data.  Even if a data cube is small enough to be kept in memory, I recommend loading it as an f-cube (Lispix gives you this option) because it is fast and keeps Lispix memory use low.  Lispix will likely soon open data cubes only as file-based cubes.  Preprocessing is required, but only once, and can be done in 'batch' mode.  File cubes require twice the storage space on your hard drive.  PCA does not yet work with file-based cubes.


With the development of fast SDD detectors, spectral images are becoming too large to hold completely in memory, even on 1 GB+ machines.  Up to this point, Lispix has stored the entire data cube in memory, and provided some tools to clip, crop, or truncate cubes that are too large, so that they will fit.  This has become increasingly impractical.  Lispix now handles large data cubes using file-based methods, where the disk file is treated somewhat like very large computer memory.  Open these files in the normal way, using File / Open in the Lispix Menu Bar, and use the normal data cube sliders and tools with them.  Note that the old bottom row of buttons in the data cube tool, which provided manual file-based means of clipping and cropping, has been eliminated.

[Images: old and new versions of the data cube tool]

(You can now choose your button color scheme.)

 

The in-memory data cubes are called image-cubes and vector-cubes, denoting how the data is internally represented in Lispix.  Lispix now has a new type of cube called a file-cube, which will display data cubes of any foreseeable size.  The small title of such a cube starts with "f-cube".  If a file has not already been preprocessed, this is done before the cube is displayed.  You can preprocess a number of files in batch mode using **Data Cube** / Read / Preprocess Large Data Cubes.

Preprocessing File Cubes

When you open a spectral image for the first time, Lispix presents this dialog:

 

Preprocessing primarily involves making a copy of the original data cube, but stored in a different sequence.  If the original cube is recorded vector-by-vector, the copy is stored image-by-image and has "-xim" appended to the name.  Making this "transposed" copy of the data is both memory and time intensive.  I recommend preprocessing files in batch mode, so that you don't have to wait, and then restarting Lispix to return most of the memory to the operating system.  The preprocessing can be interrupted:  the next time the batch processing is run, it will start over on the file.  While the copy of the cube is being generated, the file has a .tmp extension.  It is renamed to .raw when preprocessing is complete.
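As a rough illustration of what such a "transposed" copy involves, here is a minimal Python sketch.  It is not the Lispix code.  It assumes one byte per value and a known header length, and the function name and parameters (`transpose_to_image_order`, `slices_per_pass`, `header_bytes`) are made up for this example.  It fills a buffer holding a limited number of image slices on each pass over the original file, and writes to a .tmp file that is renamed to .raw only when the copy is complete.

```python
import os

def transpose_to_image_order(src, dst_raw, n_pixels, n_channels,
                             slices_per_pass=200, header_bytes=0):
    """Copy a vector-ordered cube (1 byte per value) into image order.

    Hypothetical sketch: every name and parameter here is an assumption.
    """
    base, _ext = os.path.splitext(dst_raw)
    tmp = base + ".tmp"  # work file, renamed only when the copy is complete
    with open(src, "rb") as fin, open(tmp, "wb") as fout:
        for c0 in range(0, n_channels, slices_per_pass):
            c1 = min(c0 + slices_per_pass, n_channels)
            # Buffer for slices c0..c1-1; this is what limits memory use.
            block = bytearray((c1 - c0) * n_pixels)
            fin.seek(header_bytes)  # one full pass over the original per block
            for p in range(n_pixels):
                vec = fin.read(n_channels)  # the spectrum of pixel p
                if len(vec) != n_channels:
                    raise ValueError("source file is shorter than expected")
                # Scatter this pixel's values into position p of each slice.
                block[p::n_pixels] = vec[c0:c1]
            fout.write(block)
    os.replace(tmp, dst_raw)
```

A larger `slices_per_pass` means fewer passes over the original file but a larger buffer, which is the trade-off behind the "number of slices" setting described on this page.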

To "batch" preprocess your data cubes, use:

  1. **Data Cube** / Open / Preprocess Large Data Cubes
  2. Select a top level file in whatever folder contains all of the data cubes that you wish to preprocess.  Lispix will search the folder and all sub-directories, preprocessing the data cubes as it finds them.  The progress bar at any time is for the data cube being preprocessed at that time.
  3. Select the number of slices to preprocess at once.  200 is the default, which works for my networked drive.  800 works for my local drive. See Note.
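The folder search in step 2 can be pictured with the short Python sketch below.  It is only an illustration, not what Lispix actually runs.  It assumes that cubes have the .spd extension and that the "-xim.raw" or "-xsp.raw" copy sits next to the original, and the function name `find_unprocessed_cubes` is invented for this example.

```python
import os

def find_unprocessed_cubes(top_folder, cube_ext=".spd"):
    """List cubes under top_folder that have no transposed copy yet.

    Hypothetical sketch: the extension and the location of the copy
    are assumptions, not confirmed Lispix behavior.
    """
    todo = []
    for folder, _subdirs, files in os.walk(top_folder):
        names = set(files)
        for name in sorted(files):
            base, ext = os.path.splitext(name)
            if ext.lower() != cube_ext:
                continue  # not a data cube
            if base + "-xim.raw" in names or base + "-xsp.raw" in names:
                continue  # already preprocessed
            todo.append(os.path.join(folder, name))
    return todo
```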

You can interrupt the preprocessing at any time by closing the progress bar.  Lispix will continue to "crunch" the cube, as can be seen by the indicator for your hard drive, until the next update of the progress bar occurs, at which time Lispix will say "Cancelled by closing loop monitor."  If you wish to halt preprocessing immediately, you can try right-clicking on the small blue Franz Lisp portrait/icon near the clock at the lower-right of your screen and selecting Interrupt Lisp, or using Task Manager / Applications / Lx--- / End Task.  Interrupting the preprocessing will leave a .tmp file, which Lispix will delete the next time it preprocesses that file, or which you may trash.

File Overhead

In order to respond quickly, often faster than with the memory-based cubes, Lispix stores copies and summaries of the data in additional files.  You can open the data by clicking either on the original data file, or on the copy.
Here is an example.  "Raney small.spd" is the original data.  All the other files are auxiliary files made by the preprocessing step.
  1. Original Data:  Raney small.spd
    Contents of the "... -LXD" folder

    The -xim.raw file is the duplicate of the .spd file stored in image order.  It lacks the 2 kb header.  If this duplicate were in vector or spectrum order, then the name would have been "Raney small-xsp".

    The -LXD folder (Lispix Data) contains raw files of various spectra.
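To show why an image-ordered copy makes image display fast, here is a small Python sketch of reading from a headerless, image-ordered raw file.  It assumes one byte per value and row-by-row pixel order within each image.  The function names are invented for this example and are not part of Lispix.

```python
def read_plane(raw_path, channel, width, height):
    """Read one whole image slice with a single seek and a single read."""
    plane_bytes = width * height
    with open(raw_path, "rb") as f:
        f.seek(channel * plane_bytes)  # no header to skip in the raw copy
        return f.read(plane_bytes)

def read_spectrum(raw_path, x, y, width, height, n_channels):
    """Read the spectrum at pixel (x, y): one small read per slice."""
    plane_bytes = width * height
    offset = y * width + x  # assumes rows are stored one after another
    values = bytearray()
    with open(raw_path, "rb") as f:
        for c in range(n_channels):
            f.seek(c * plane_bytes + offset)
            values += f.read(1)
    return bytes(values)
```

With this layout an image is one contiguous read, while a spectrum needs a seek for every slice.  A vector-ordered file has the opposite behavior.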


Note:

The number of slices to preprocess at one time controls the amount of memory that Lispix tries to grab for making the mirror cube file.  The more memory it gets, the faster the preprocessing will go, but there are limits.  When you open the cube manually, with File / Open, Lispix will ask for the number of slices to process, and give you the maximum that it thinks is available.  Asking for significantly more than that will result in Lispix erroring out.

If the data is on a networked drive, Lispix will error out even when the number of slices is about a third of this maximum.

In any case, asking for too much memory will slow down your computer and other applications that you are running.  You need to quit Lispix to free the memory.  Lispix needs very little memory to load the cubes (file mode) once they have been pre-processed.

For a cube of 1024 slices of 512 x 384 x 1 byte each, on my machine, 200 or 300 slices at a time work for a network drive, and 800 or a little more work for my local "C" drive.
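For a feel for the sizes involved, the Python snippet below works out how many bytes the raw slice data occupies for a given number of slices.  It assumes that the buffer is simply the number of slices times the size of one slice.  The memory Lispix actually requests may well be larger, and the function name `buffer_bytes` is invented for this example.

```python
def buffer_bytes(n_slices, width, height, bytes_per_value=1):
    """Bytes of raw slice data held at once (an assumed, simplified model)."""
    return n_slices * width * height * bytes_per_value

MIB = 1024 * 1024
for n in (200, 300, 800, 1024):
    print(n, "slices:", buffer_bytes(n, 512, 384) / MIB, "MiB")
# 200 slices: 37.5 MiB
# 300 slices: 56.25 MiB
# 800 slices: 150.0 MiB
# 1024 slices: 192.0 MiB
```

So the whole example cube is 192 MiB of data, and 800 slices at a time corresponds to about 150 MiB of slice data in this simplified model.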