NAME
ftselect - Copy selected rows from the input table to a new output table.
USAGE
ftselect infile[ext] outfile expression
DESCRIPTION
ftselect creates a filtered copy of the input table containing only the
subset of rows that satisfy a user-specified set of conditions. The
Boolean expression may be arbitrarily complex and is typically a
function of the values in one or more columns in the table. The
expression is evaluated on a row by row basis, and if it evaluates to
TRUE (not equal to zero) then that row is copied to the output table.
The output file will by default include a copy of any other extensions
in the input FITS file, but if 'copyall = NO' then the output file will
contain only the filtered table extension, and the required null
primary array.
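The row-by-row selection rule described above can be illustrated with a minimal Python sketch. It is not how ftselect is implemented; the in-memory table, the `select_rows` helper, and the column values are hypothetical and serve only to show that a row is kept when the expression evaluates to a nonzero value.

```python
# Hypothetical in-memory stand-in for a FITS table: one dict per row.
rows = [
    {"PI": 5, "TIME": 10.0},
    {"PI": 12, "TIME": 11.5},
    {"PI": 150, "TIME": 12.0},
    {"PI": 250, "TIME": 13.2},
]


def select_rows(table, expression):
    """Keep the rows for which expression(row) is nonzero (TRUE)."""
    return [row for row in table if expression(row) != 0]


# Python equivalent of the ftselect expression 'PI > 11 && PI < 201'.
selected = select_rows(rows, lambda r: r["PI"] > 11 and r["PI"] < 201)
print([r["PI"] for r in selected])
```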
Note that the 'ftcopy' task can perform the same row selection
operations as ftselect, as shown in the examples. The main differences
are that the ftselect syntax for selecting rows is easier to use, and
that ftselect places no limit on the size of the output table, whereas
ftcopy must first construct the output table completely in memory
before writing it to the output file.
See the 'calc_express' help file for a full description of the
allowed expression syntax. In general, the expression may contain any
of the following elements:
- Column and keyword names:
- The value of the column (in the current row) or the keyword will
be substituted into the expression. Precede a keyword name with a
single pound sign, '#', if necessary to distinguish it from a column
with the same name. If the column or keyword name contains a space or
a character that could be interpreted as a mathematical operator
(or the special case of a column or keyword named upper or lower case "T" or "F"),
enclose the name in dollar sign characters, as in the column $MAX-PHA$,
or the keyword #$DATE-OBS$. NOTE: column names that begin with the
letters 'b', 'o', or 'h' followed by numeric digits must be enclosed in
'$' characters, otherwise they will be interpreted as numeric constants
in binary, octal, or hex notation, respectively.
To use a table
entry in a row other than the current one, follow the column's name
with a row offset within curly braces. For example, 'PHA{-3}' will
evaluate to the value of column PHA, 3 rows above the row currently
being processed. One cannot specify an absolute row number, only a
relative offset. Rows that fall outside the table will be treated as
undefined, or NULLs.
To use a single element of a vector column within a calculator expression,
follow the column's name with an element number (beginning with 1 for
the first element) within
square brackets, as in 'PHAS[1] * 10'. Note that this expression
evaluates to a scalar quantity (a single number), whereas the expression
'PHAS * 10' evaluates to a vector that is created by
multiplying each element of the PHAS vector by 10.
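The row-offset and vector-element rules can be sketched in Python as follows. This is only an illustration of the semantics stated above: the column data and the `column_at_offset` helper are hypothetical, rows are assumed to be numbered from 1, and None stands in for an undefined (NULL) value.

```python
pha = [10, 20, 30, 40, 50]  # hypothetical scalar column PHA, one value per row


def column_at_offset(column, row, offset):
    """Value of the column at (row + offset), with rows numbered from 1.

    Returns None (standing in for NULL) when the target row is outside the table.
    """
    target = row + offset
    if target < 1 or target > len(column):
        return None
    return column[target - 1]


print(column_at_offset(pha, 5, -3))  # like PHA{-3} evaluated at row 5
print(column_at_offset(pha, 2, -3))  # target row lies before the first row

phas = [3, 7, 9]                 # hypothetical vector cell of column PHAS
scalar = phas[1 - 1] * 10        # like PHAS[1] * 10: element numbers start at 1
vector = [v * 10 for v in phas]  # like PHAS * 10: applied to every element
print(scalar, vector)
```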
- Mathematical operators:
- +, -, *, /, ** or ^ (exponentiation)
- Boolean operators in C or Fortran-type notation:
- .eq., ==, .ne., !=,
.lt., <, .le., <=, .gt., >, .ge., >=, .or., ||, .and., &&, .not., !,
and ~ (approximately equal, to within 1E-07)
- Math library functions:
- abs(x), cos(x), sin(x), tan(x), arccos(x), arcsin(x), arctan(x), arctan2(y,x),
cosh(x), sinh(x), tanh(x), round(x), floor(x), ceil(x),
exp(x), sqrt(x), log(x), log10(x), x%y (modulus), erf(x), erfc(x), gamma(x),
random() (returns a random number >= 0 and < 1),
randomn() (returns a random number drawn from a Gaussian distribution
with zero mean and unit standard deviation),
randomp(x) (returns a random integer drawn from a Poisson distribution
whose expected number of counts is x; x may be any positive real number,
including fractional values),
min(x,y), max(x,y),
accum(x), seqdiff(x), angsep(ra1, dec1, ra2, dec2) (all in degrees).
- Numerical constants:
- Numeric values are assumed to be in decimal notation. Integer
constants may also be in hexadecimal (0x123f7), octal (0o12737) or
binary (0b1001010). Such integer constants are 32-bit integers.
The following predefined constants may also be
used: #pi (3.1415...), #deg (#pi/180), #e (2.7182...), #row
(substitutes the current row number into the expression). Two special
constants, #null and #snull, can be used for testing if the expression
value has an undefined numeric value or an undefined string value,
respectively.
- String constants:
- enclose string values in quotes, e.g., 'Crab', 'M101'
- Datatype casts to convert reals to integers or integers to reals:
- (int) x, (float) i
- Conditional expressions:
- 'b?x:y' where expression 'x' is evaluated if
'b' is TRUE (not equal to zero), otherwise expression 'y' is evaluated.
- Test for near equality:
- near(value1, value2, tolerance) returns 0 if
value1 and value2 differ by more than tolerance.
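As a rough illustration of the near() test, the following Python sketch mirrors the rule stated above. Returning 1 when the values agree to within the tolerance is an assumption here, since the text only states the 0 case.

```python
def near(value1, value2, tolerance):
    """Return 0 if the values differ by more than tolerance.

    Returning 1 otherwise is an assumption made for this sketch.
    """
    return 0 if abs(value1 - value2) > tolerance else 1


print(near(1.0, 1.05, 0.1))
print(near(1.0, 1.2, 0.1))
```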
- Bit masks:
- Bit masks may be used in logical expressions and may be of arbitrary
length. Bit masks may be binary (b110001), octal (o44712) or hexadecimal
(h0f3D). The 'x' character represents a wild card at that position.
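The wild-card behavior of a binary bit mask can be sketched in Python as below. The `matches_mask` helper is hypothetical and simplified: it compares two equal-length strings of binary digits (written without the leading 'b') and does not cover octal or hexadecimal masks or masks of differing length.

```python
def matches_mask(bits, mask):
    """Compare a string of binary digits against a mask of the same length.

    An 'x' in the mask matches either bit value at that position.
    """
    if len(bits) != len(mask):
        raise ValueError("this sketch only compares strings of equal length")
    return all(m in ("x", b) for b, m in zip(bits, mask))


print(matches_mask("110001", "11x001"))
print(matches_mask("110101", "11x001"))
```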
- Good time interval test:
- The gtifilter() function returns 1 if the time value
lies within one of the good time intervals, otherwise it returns 0.
Specifying 'gtifilter()' is equivalent to 'gtifilter("", TIME,
"*START*", "*STOP*")' and uses the GTI extension in the current FITS
file to filter the TIME column using the START and STOP columns in the
GTI extension. The gtifind() function takes the same arguments
as gtifilter() but returns the GTI entry number that brackets
the time sample instead of true/false (or returns -1 if no GTI
brackets the time sample).
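The relationship between the two GTI tests can be sketched in Python. The `find_gti` and `in_gti` helpers below are hypothetical and do not reproduce the calculator's argument list. Two assumptions are made: the start time is treated as inclusive and the stop time as exclusive, and GTI entries are numbered from 1.

```python
def find_gti(time, starts, stops):
    """Return the number of the interval that brackets time, or -1 if none does.

    Assumptions for this sketch: start <= time < stop, entries numbered from 1.
    """
    for number, (start, stop) in enumerate(zip(starts, stops), start=1):
        if start <= time < stop:
            return number
    return -1


def in_gti(time, starts, stops):
    """Return 1 if time lies within one of the intervals, otherwise 0."""
    return 1 if find_gti(time, starts, stops) != -1 else 0


starts = [0.0, 100.0]  # hypothetical START column of a GTI extension
stops = [50.0, 150.0]  # hypothetical STOP column of a GTI extension
print(in_gti(10.0, starts, stops), find_gti(120.0, starts, stops))
print(in_gti(75.0, starts, stops), find_gti(75.0, starts, stops))
```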
- Good time interval overlap calculation:
- The gtioverlap() function returns the amount of overlap
time between a user requested time range and a GTI extension.
See calc_express for more
information.
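The overlap calculation can be pictured with a short Python sketch. The `overlap_with_gtis` helper is hypothetical, does not reproduce the gtioverlap() argument list, and assumes that the good time intervals do not overlap one another.

```python
def overlap_with_gtis(t_start, t_stop, starts, stops):
    """Total time shared by the range [t_start, t_stop] and a set of intervals.

    Assumes the intervals themselves do not overlap one another.
    """
    total = 0.0
    for start, stop in zip(starts, stops):
        shared = min(t_stop, stop) - max(t_start, start)
        if shared > 0:
            total += shared
    return total


# Hypothetical GTIs [0, 50] and [100, 150]; requested range [40, 120].
print(overlap_with_gtis(40.0, 120.0, [0.0, 100.0], [50.0, 150.0]))
```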
- Spatial region test:
- The regfilter() function returns 1 if the spatial position associated
with that row of the table is located within the region defined by
the specified region file. Specifying 'regfilter("region.reg", xpos, ypos)'
uses the xpos and ypos table columns (and any associated World Coordinate
System keywords) to determine the position, and the region file named 'region.reg'.
- Vector column operators:
- These functions operate on a vector and return a scalar result:
MIN(V), MAX(V), AVERAGE(V), MEDIAN(V), SUM(V), STDDEV(V),
NELEM(V), NVALID(V) (number of non-null values), NAXIS(V),
and NAXES(V,n).
See the 'calc_express' help file for more information.
PARAMETERS
- infile [filename]
- File name and optional extension name or number enclosed in square
brackets of the input table whose rows will be selectively copied
(e.g., 'file.fits[events]'). If an explicit extension is not specified
then the first table extension in the file that is not a GTI (Good Time
Interval) extension will be used. Additional filter specifiers can be
appended to the file name, also enclosed in square brackets, to create
a virtual input table as shown in some of the examples.
- outfile [filename]
- Name of the output file that will contain the selected rows from
the input file. If copyall = YES, any other HDUs in the input file, in
addition to the table that is being filtered, will also be copied
verbatim to the output file. To overwrite a preexisting file with the
same name, prefix the name with an exclamation point '!' (or '\!' on
the Unix command line), or else set the 'clobber' parameter = YES.
- expression [string]
- The Boolean expression used to select rows. If the expression
evaluates to zero then that row will not be copied. A text file
containing the expression can be specified by preceding the
filename with the '@' character, as in '@file.txt'. The expression in
the file can be arbitrarily complex and extend over multiple lines of
the file. Lines that begin with 2 slash characters ('//') will be
ignored and may be used to add comments to the file.
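The layout of such an expression file can be illustrated with a small Python sketch that strips the comment lines and joins what remains. The `load_expression` helper and the file contents are hypothetical, and joining the remaining lines with a single space is an assumption about how a multi-line expression is read.

```python
def load_expression(text):
    """Drop lines that begin with '//' and join the rest into one expression.

    Joining with a single space is an assumption made for this sketch.
    """
    kept = [line.strip() for line in text.splitlines()
            if not line.startswith("//")]
    return " ".join(part for part in kept if part)


# Hypothetical contents of a file passed to ftselect as '@file.txt'.
example = """// keep events in the PI band
PI > 11 &&
// upper bound
PI < 201
"""
print(load_expression(example))
```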
- (copyall = YES) [boolean]
- If copyall = YES (the default) then all other HDUs in the input file
will also be copied, without modification, to the output file. If
copyall = NO, then only the single table HDU specified by infile will be
copied to the output file along with the required null primary array.
- (clobber = NO) [boolean]
- If outfile already exists, then setting clobber = YES will cause it to be overwritten.
- (chatter = 1) [integer, 0 - 5]
- Controls the amount of informative text written to standard output.
Setting chatter = 3 or greater will display the number of rows
that were selected. Setting chatter = 5 will also
produce detailed diagnostic output.
- (history = NO) [boolean]
- If history = YES, then a set of HISTORY keywords will be written
to the header of the output table to record the value of all the ftselect
task parameters that were used to produce the output file.
EXAMPLES
Note that when commands are issued on the Unix command line, strings
containing special characters such as '[' or ']' must be enclosed in
single or double quotes. In each example the equivalent
ftcopy command that performs the same operation is also shown.
1. Select all the rows from the input events table that have
a PI column value between 12 and 200, inclusive.
ftselect 'input.fits[events]' outfile.fits 'PI > 11 && PI < 201'
ftcopy 'input.fits[events][PI > 11 && PI < 201]' outfile.fits
2. Extract rows 125 through 175 from the first table extension in FITS
file named manyrows.fits and write the result to a FITS file named
fewrows.fits:
ftselect manyrows.fits fewrows.fits '#row >= 125 && #row <= 175'
ftcopy 'manyrows.fits[#row >= 125 && #row <= 175]' fewrows.fits
3. Select rows in which the 'counts' column value divided by the
'EXPOSURE' keyword is greater than 0.1:
ftselect rate.fits out.fits 'counts / #exposure > 0.1'
ftcopy 'rate.fits[counts / #exposure > 0.1]' out.fits
4. Create a virtual input table containing a new 'Rate' column that is
calculated on the fly by dividing the 'counts' column by the 'time'
column, then copy only those rows that have a positive 'Rate' value.
ftselect 'in.fits[col Rate=counts/time;*]' out.fits 'Rate > 0'
ftcopy 'in.fits[col Rate=counts/time;*][Rate > 0]' out.fits
5. Filter the events extension, by copying only those rows
that have a 'TIME' column value that falls within one of the
Good Time Intervals as specified in the GTI extension contained
in the same input file as the events extension.
ftselect 'in.fits[events]' out.fits 'gtifilter()'
ftcopy 'in.fits[events][gtifilter()]' out.fits
6. Same as the previous example, except that this time the GTI
extension is contained in a file called 'mygti.fits'.
ftselect 'in.fits[events]' out.fits 'gtifilter("mygti.fits")'
ftcopy 'in.fits[events][gtifilter("mygti.fits")]' out.fits
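When an ftselect command line is built from a script rather than typed by hand, the quoting rule noted at the start of the examples still applies. The following Python sketch uses the standard shlex module to add the quotes; the `ftselect_command` helper is hypothetical, and the sketch only builds and prints the command string without running ftselect.

```python
import shlex


def ftselect_command(infile, outfile, expression):
    """Build a shell-safe ftselect command line from its three arguments."""
    return shlex.join(["ftselect", infile, outfile, expression])


# Arguments taken from example 1; shlex quotes only the strings that need it.
command = ftselect_command("input.fits[events]", "outfile.fits",
                           "PI > 11 && PI < 201")
print(command)
```

To run the command from Python without going through a shell, the unquoted argument list could be passed directly to subprocess.run, in which case no shell quoting is needed.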
SEE ALSO
ftcopy,
ftcalc,
ftdelrow,
ftsort,
filenames,
colfilter,
rowfilter,
fv, the interactive FITS file editor, can also be used to select rows
in a table.
The design of this task is based on fcalc
in the ftools package and on the CXC dmtcalc task.
LAST MODIFIED
March 2002