Numerical analysis routines¶

Contents:

My module for various data analysis tasks.

REQUIREMENTS: numpy, Plotting and analysis tools (for errxy())

2008-07-25 16:20 IJC: Created.

2009-12-08 11:31 IJC: Updated transit flag in planet objects and
rveph() function.

2010-08-03 15:38 IJC: Merged versions.

2010-10-28 11:53 IJMC: Updated documentation strings for Sphinx;
moved pylab import inside individual function.
2011-04-13 14:29 IJMC: Added keyword functionality to fmin()
(taken from scipy.optimize).

2011-04-14 09:48 IJMC: Added a few fundamental constants.

2011-04-22 14:48 IJMC: Added trueanomaly() and
eccentricanomaly().
2011-06-21 13:15 IJMC: Added get_t23() and get_t14() to
planet objects.
analysis.allanvariance(data, dt=1)[source]

Compute the Allan variance on a set of regularly-sampled data (1D).

If the time between samples is dt and there are N total samples, the returned variance spectrum will have frequency indices from 1/dt to (N-1)/dt.

analysis.amedian(a, axis=None)[source]

Median the array over the given axis. If the axis is None, median over all dimensions of the array.

Think of this as normal Numpy median, but preserving dimensionality.

analysis.binarray(img, ndown, axis=None)[source]

downsample a 2D image

Takes a 1D vector or 2D array and reduce resolution by an integer factor “ndown”. This is done by binning the array – i.e., integrating over square blocks of pixels of width “ndown”

If keyword “axis” is None, bin over all axes. Otherwise, bin over the single specified axis.

Note that ‘ndown’ can also be a sequence: e.g., [2, 1]

EXAMPLE: ```[img_ds]=binarray(img,ndown) ```
analysis.cheby2poly(cin)[source]

Convert chebychev coefficients to ‘normal’ polyval coefficients .

INPUT: chebyt coefficients poly coefficients (e.g., for use w/polyval) poly2cheby(), gpolyval(); scipy.special.chebyt
analysis.confmap(map, frac, **kw)[source]

Return the confidence level of a 2D histogram or array that encloses the specified fraction of the total sum.

INPUTS: map : 1D or 2D numpy array Probability map (from hist2d or kde) frac : float, 0 <= frac <= 1 desired fraction of enclosed energy of map ordinate : None or 1D array If 1D map, interpolates onto the desired value. This could cause problems when you aren’t just setting upper/lower limits.... dumbconf() for 1D distributions
analysis.dopspec(starspec, planetspec, starrv, planetrv, disp, starphase=, []planetphase=, []wlscale=True)[source]

Generate combined time series spectra using planet and star models, planet and star RV profiles.

D = dopspec(sspec, pspec, sRV, pRV, disp, sphase=[], pphase=[])

INPUTS: OPTIONAL INPUTS: sspec, pspec: star, planet spectra; must be on a common logarithmic wavelength grid sRV, pRV: star, planet radial velocities in m/s disp: constant logarithmic dispersion of the wavelength grid: LAMBDA_i/LAMBDA_(i-1) sphase, pphase: normalized phase functions of star and planet. The inputs sspec and pspec will be scaled by these values for each observation. wlscale: return relative wavelength scale for new data
NOTE: Input spectra must be linearly spaced in log wavelength and increasing:
that is, they must have [lambda_i / lambda_(i-1)] = disp = constant > 1

Positive velocities are directed AWAY from the observer.

analysis.doubleGaussian(p, x)[source]

Compute the sum of two gaussian distributions at the points x.

p is a six- or seven-component sequence:

y = [p6 +] p0/(p1*sqrt(2pi)) * exp(-(x-p2)**2 / (2*p1**2)) +
p3/(p4*sqrt(2pi)) * exp(-(x-p5)**2 / (2*p4**2))

p[0] – Area of gaussian A

p[1] – one-sigma dispersion of gaussian A

p[2] – central offset (mean location) of gaussian A

p[3] – Area of gaussian B

p[4] – one-sigma dispersion of gaussian B

p[5] – central offset (mean location) of gaussian B

p[6] – optional constant, vertical offset

NOTE: FWHM = 2*sqrt(2*ln(2)) * p1 ~ 2.3548*p1

analysis.doubleGaussianCen(p, x, mu1, mu2)[source]

Compute the sum of two gaussian distributions at the points x. The distributions have central moments mu1 and mu2.

Useful for fitting to partially blended spectral data.

p is a four- or five-component sequence:

y = [p6 +] p0/(p1*sqrt(2pi)) * exp(-(x-mu1)**2 / (2*p1**2)) +
p3/(p4*sqrt(2pi)) * exp(-(x-mu2)**2 / (2*p4**2))

p[0] – Area of gaussian A p[1] – one-sigma dispersion of gaussian A p[2] – Area of gaussian B p[3] – one-sigma dispersion of gaussian B p[4] – optional constant, vertical offset

mu1 – central offset (mean location) of gaussian A mu2 – central offset (mean location) of gaussian B

NOTE: FWHM = 2*sqrt(2*ln(2)) * p1 ~ 2.3548*p1

analysis.dumbconf(vec, sig, type='central', mid='mean', verbose=False)[source]

Determine two-sided and one-sided confidence limits, using sorting.

INPUTS: OPTIONAL INPUTS: vec : sequence 1D Vector of data values, for which confidence levels will be computed. sig : scalar Confidence level, 0 < sig < 1. If type=’central’, we return the value X for which the range (mid-X, mid+x) encloses a fraction sig of the data values. type=’central’ – ‘upper’, ‘lower’, or ‘central’ confidence limits mid=’mean’ – compute middle with mean or median confmap() for 2D distributions ```from numpy import random from analysis import dumbconf x = random.randn(10000) dumbconf(x, 0.683) # ---> 1.0 (one-sigma) dumbconf(3*x, 0.954) # ---> 6.0 (two-sigma) dumbconf(x+2, 0.997, type='lower') # ---> -0.74 dumbconf(x+2, 0.997, type='upper') # ---> 4.7 ```

Some typical values for a Normal (Gaussian) distribution:

type confidence level
one-sigma 0.6826895
2 sigma 0.9544997
3 sigma 0.9973002
4 sigma 0.9999366
5 sigma 0.9999994
analysis.eccentricanomaly(ecc, manom=None, tanom=None, tol=1e-08)[source]

Calculate (Keplerian, orbital) eccentric anomaly.

One optional input must be given.

INPUT: OPTIONAL_INPUTS: ecc – scalar. orbital eccentricity. manom – scalar or sequence. Mean anomaly, equal to 2*pi*(t - t0)/period tanom – scalar or Numpy array. True anomaly. See
analysis.egaussian(p, x, y, e=None)[source]

Compute the deviation between the values y and the gaussian defined by p, x:

p is a three- or four-component array, list, or tuple.

Returns: y - p3 - p0/(p1*sqrt(2pi)) * exp(-(x-p2)**2 / (2*p1**2))

if an error array, e (typ. one-sigma) is entered, the returned value is divided by e.

analysis.egaussian2d(p, x, y, z, w=None)[source]

Return the error associated with a 2D gaussian fit, using gaussian2d.

w is an array of weights, typically 1./sigma**2

analysis.error_dropoff(data)[source]

Calculates the dropoff in measurement uncertainty with increasing number of samples (a random and uncorrelated set of data should drop of as 1/sqrt[n] ).

E(0), the first returned element, always returns the uncertainty in a single measurement (i.e., the standard deviation).

EXAMPLE: Compute the errors on an array of 3 column vectors ```data = randn(1000,3) e = error_dropoff(data) plot(e[1], 1./sqrt(e[1]), '-', e[1], e[0], 'x') xlim([0,30]) legend(['Theoretical [1/sqrt(N)]', 'Computed errors']) ```
analysis.fftfilter1d(vec, bandwidth, retfilter=False)[source]

Apply a hard-edged low-pass filter to an input vector.

INPUTS: vec – sequence – 1D vector, assumed to be evenly sampled bandwidth – integer – size of the filter passband in cycles per signal duration. f <= bandwidth is passed through; f > bandwidth is rejected. retfilter – bool – if True, return the tuple (filtered_vec, filter) Lopass-filtered version of vec Assumes the input is real-valued.
analysis.fixval(arr, repval, retarr=False)[source]

Fix non-finite values in a numpy array, and replace them with repval.

INPUT: arr – numpy array, with values to be replaced. repval – value to replace non-finite elements with retarr – if False, changes values in arr directly (more efficient). if True, returns a fixed copy of the input array, which is left unchanged. ```fixval(arr, -1) ```
analysis.fmin(func, x0, args=(), kw={}, xtol=0.0001, ftol=0.0001, maxiter=None, maxfun=None, full_output=0, disp=1, retall=0, callback=None, zdelt=0.00025, nonzdelt=0.05, holdfixed=None)[source]

Minimize a function using the downhill simplex algorithm – now with KEYWORDS.

Parameters: func : callable func(x,*args) The objective function to be minimized. x0 : ndarray Initial guess. args : tuple Extra arguments passed to func, i.e. f(x,*args). callback : callable Called after each iteration, as callback(xk), where xk is the current parameter vector. (xopt, {fopt, iter, funcalls, warnflag}) xopt : ndarray Parameter that minimizes function. fopt : float Value of function at minimum: fopt = func(xopt). iter : int Number of iterations performed. funcalls : int Number of function calls made. warnflag : int 1 : Maximum number of function evaluations made. 2 : Maximum number of iterations reached. allvecs : list Solution at each iteration.

Other Parameters:

xtol : float
Relative error in xopt acceptable for convergence.
ftol : number
Relative error in func(xopt) acceptable for convergence.
maxiter : int
Maximum number of iterations to perform.
maxfun : number
Maximum number of function evaluations to make [200*len(x0)]
full_output : bool
Set to True if fval and warnflag outputs are desired.
disp : bool
Set to True to print convergence messages.
retall : bool
Set to True to return list of solutions at each iteration.
zdelt : number
Set the initial stepsize for x0[k] equal to zero
nonzdelt : number
Set the initial stepsize for x0[k] nonzero
holdfixed : sequence
Indices of x0 to hold fixed (e.g., [1, 2, 4])
TBD: gprior : tuple or sequence of tuples Set a gaussian prior on the indicated parameter, such that chisq += ((x0[p] - val)/unc_val)**2, where the parameters are defined by the tuple gprior=(param, val, unc_val) Uses a Nelder-Mead simplex algorithm to find the minimum of function of one or more variables.
analysis.fmin_helper2(all_args)[source]

Allows me to wrap fmin() within pool.map() for multithreading.

EXAMPLE: :: from multiprocessing import Pool import analysis as an import phasecurves as pc pool = Pool(processes=nthreads) fits = pool.map(an.fmin_helper2, [[pc.errfunc, pos0[jj], mcargs, test_kws] for jj in xrange(0, nwalkers, int(nwalkers/12.))]) # The above line is equivalent to, but roughly ~(nthreads) # times faster, than the standard way to do it: fits2 = [an.fmin(pc.errfunc, pos0[jj], mcargs, **test_kw) for jj in xrange(0, nwalkers, int(nwalkers/10.))] This must be a separate, stand-alone function in order to be ‘pickleable’, which is required by pool.map().
analysis.fmin_powell(func, x0, args=(), kw={}, xtol=0.0001, ftol=0.0001, maxiter=None, maxfun=None, full_output=0, disp=1, retall=0, callback=None, direc=None, holdfixed=None)[source]

Minimize a function using modified Powell’s method – now with KEYWORDS.

Parameters: func : callable f(x,*args) Objective function to be minimized. x0 : ndarray Initial guess. args : tuple Eextra arguments passed to func. kw : dict Keyword arguments passed to func. callback : callable An optional user-supplied function, called after each iteration. Called as callback(xk), where xk is the current parameter vector. direc : ndarray Initial direction set. (xopt, {fopt, xi, direc, iter, funcalls, warnflag}, {allvecs}) xopt : ndarray Parameter which minimizes func. fopt : number Value of function at minimum: fopt = func(xopt). direc : ndarray Current direction set. iter : int Number of iterations. funcalls : int Number of function calls made. warnflag : int Integer warning flag: 1 : Maximum number of function evaluations. 2 : Maximum number of iterations. allvecs : list List of solutions at each iteration.

Other Parameters:

xtol : float
Line-search error tolerance.
ftol : float
Relative error in func(xopt) acceptable for convergence.
maxiter : int
Maximum number of iterations to perform.
maxfun : int
Maximum number of function evaluations to make.
full_output : bool
If True, fopt, xi, direc, iter, funcalls, and warnflag are returned.
disp : bool
If True, print convergence messages.
retall : bool
If True, return a list of the solution at each iteration.
Notes: Uses a modification of Powell’s method to find the minimum of a function of N variables.
analysis.gaussian(p, x)[source]

Compute a gaussian distribution at the points x.

p is a three- or four-component array, list, or tuple:

y = [p3 +] p0/(p1*sqrt(2pi)) * exp(-(x-p2)**2 / (2*p1**2))

p[0] – Area of the gaussian p[1] – one-sigma dispersion p[2] – central offset (mean location) p[3] – optional constant, vertical offset

NOTE: FWHM = 2*sqrt(2*ln(2)) * p1 ~ 2.3548*p1

analysis.gaussian2d(p, x, y)[source]

Compute a 2D gaussian distribution at the points x, y.

INPUTS: p : sequence a four- or five-component array, list, or tuple: z = [p4 +] p0/(2*pi*p1**2) * exp(-((x-p2)**2+(y-p3)) / (2*p1**2)) p[0] – Area of the gaussian p[1] – one-sigma dispersion p[2] – x-central offset p[3] – y-central offset p[4] – optional constant, vertical offset x : NumPy array X-coordinate values at which the above relation will be computed. y : NumPy array Y-coordinate values at which the above relation will be computed. gaussian() (1D)
analysis.gaussian2d_ellip(p, x, y)[source]

Compute a 2D elliptical gaussian distribution at the points x, y.

p is a 5-, 6-, or 7-component sequence, defined as:

p[0] – Amplitude (Area of the function)

p[1] – x-dispersion

p[2] – y-dispersion

p[3] – x-central offset

p[4] – y-central offset

p[5] – optional rotation angle (radians)

p[6] – optional constant, vertical offset

X, Y are gridded data from numpy.meshgrid()

First define:

x’ = (x - p[3]) cos p[5] - (y - p[4]) sin p[5]

y’ = (x - p[3]) sin p[5] + (y - p[4]) cos p[5]

Then calculate:

U = (x’ / p[1])**2 + (y’ / p[2])**2

z = p[6] + p0/(2*pi*p1*p2) * exp(-U / 2)

analysis.gaussiannd(mu, cov, x)[source]

Compute an N-dimensional gaussian distribution at the position x.

mu is the length-N 1D vector of mean positions

cov is the NxN covariance matrix of the multinormal distribution.

x are the positions at which to compute. If X is 2D (size M x
N), it is taken as M sets of N-point positions.

analysis.generic_mcmc(*arg, **kw)[source]

Run a Markov Chain Monte Carlo (Metropolis-Hastings algorithm) on an arbitrary function.

INPUTS: OPTIONAL_INPUTS: EITHER: func : function to generate model. First argument must be “params;” subsequent arguments are passed in via the “args” keyword params : 1D sequence parameters to be fit stepsize : 1D or 2D array If 1D: 1-sigma change in parameter per iteration If 2D: covariance matrix for parameter changes. z : 1D array Contains dependent data (to be modeled) sigma : 1D array Contains standard deviation (errors) of “z” data numit : int Number of iterations to perform OR: (allparams, (arg1, arg2, ...), numit) where allparams is a concatenated list of parameters for each of several functions, and the arg_i are tuples of (func_i, stepsize_i, z_i, sigma_i). In this case the keyword ‘args’ must also be a tuple of sequences, one for each function to be MCMC’ed. args : 1D sequence Second, third, etc.... arguments to “func” nstep : int Saves every “nth” step of the chain posdef : None, ‘all’, or sequences of indices. Which elements should be restricted to positive definite? If indices, it should be of the form (e.g.): [0, 1, 4] holdfixed : None, or sequences of indices. Which elements should be held fixed in the analysis? If indices, it should be of the form (e.g.): [0, 1, 4] jointpars : None, or sequence of 2-tuples. Only for simultaneous multi-function fitting. For each pair of values passed, we set the parameters values so: allparams[pair[1]] = allparams[pair[0]] allparams : 2D array Contains all parameters at each step bestp : 1D array Contains best paramters as determined by lowest Chi^2 numaccept: int Number of accepted steps chisq: 1D array Chi-squared value at each step Numerical Recipes, 3rd Edition (Section 15.8) Wikipedia If you need an efficient MCMC algorithm, you should be using http://danfm.ca/emcee/
analysis.getblocks(vec)[source]

Return start and end indices for consecutive sequences of integer-valued indices.

Example: ```import analysis as an vec = range(5) +range(10,14) + range(22,39) starts,ends = an.getblocks(vec) print zip(starts,ends) ```
analysis.getobj(*args, **kw)[source]

Get data for a specified planet.

INPUTS: OPTIONAL INPUTS: (str) – planet name, e.g. “HD 189733 b” datafile : str datafile name verbose : bool verbosity flag ```p = getobj('55cnce') p.period ---> 2.81705 ```

The attributes of the returned object are many and varied, and can be listed using the ‘dir’ command on the returned object.

This looks up data from the local datafile, which could be out of date.

analysis.getorbitalphase(planet, hjd, **kw)[source]

Get phase of an orbiting planet.

INPUT:
planet – a planet from getobj; e.g., getobj(‘55cnce’) hjd
OUTPUT:
orbital phase – from 0 to 1.
NOTES:

If planet.transit==True, phase is based on the transit time ephemeris. If planet.transit==False, phase is based on the RV ephemeris as

computed by function rveph

analysis.gfit(func, x0, fprime, args=(), kwargs={}, maxiter=2000, ftol=0.001, factor=1.0, disp=False, bounds=None)[source]

Perform gradient-based minimization of a user-specified function.

INPUTS: func : function Function that takes as input the parameters x0, optional additional arguments args, and optional keywords kwargs, and returns the metric to be minimized as a scalar. For chi-squared minimization, a generalized option is :phasecurves:`errfunc`. x0 : sequence List or 1D NumPy array of initial-guess parameters, to be adjusted to minimize func(x0, *args, **kwargs). fprime : function Function that takes as input the parameters x0, optional additional arguments args, and optional keywords kwargs, and returns the partial derivatives of the metric to be minimized with regard to each element of x0. args : list Optional arguments to func and fprime (see above) kwargs : dict Optional keywords to func and fprime (see above) maxiter : int Maximum number of iterations to run. ftol : scalar Desired tolerance on the metric to be minimized. Iteration will continue until either iter>maxiter OR (metric_i - metric_(i+1)) < ftol. factor : scalar Factor to scale gradient before applying each new iteration. Small values will lead to slower convergences; large values will lead to wild behavior. The code attempts to (crudely) tune the value of ‘factor’ depending on how the minimization process progresses. disp : bool If True, print some text to screen on each iteration. bounds : None, or list (min, max) pairs for each element in x0, defining the bounds on that parameter. Use None or +/-inf for one of min or max when there is no bound in that direction. (params, metric, n_iter) The program attempts to be slightly clever: if the metric decreases by
analysis.gpolyval(c, x, mode='poly', retp=False)[source]

Generic polynomial evaluator.

INPUT:
c (1D array) – coefficients of polynomial to evaluate,
from highest-order to lowest.

x (1D array) – pixel values at which to evaluate C

OPTINAL INPUTS:
MODE=’poly’ – ‘poly’ uses standard monomial coefficients
as accepted by, e.g., polyval. Other modes – ‘cheby’ (1st kind) and ‘legendre’ (P_n) – convert input ‘x’ to a normalized [-1,+1] domain

RETP=False – Return coefficients as well as evaluated poly.

OUTPUT:
y – array of shape X; evaluated polynomial. (y, p) (if retp=True)

analysis.im2(data1, data2, xlab='', ylab='', tit='', bar=False, newfig=True, cl=None, x=, []y=, []fontsize=16)[source]

Show two images; title will only be drawn for the top one.

Load ATRAN atmospheric transmission data file.

INPUT:
filename – filename of the ATRAN file. This should be an
ASCII array where the second column is wavelength and the third is the atmospheric transmission.

(This can also be a list of filenames!)

OPTIONAL INPUT: wl – if True (DEFAULT) also return the wavelength scale. This can take up to twice as long for large files.
RETURNS:

if wl==True: returns a 2D array, with columns [lambda, transmission] if wl==False: returns a 1D Numpy array of the transmission

NOTE: A list of these return values is created if
‘filename’ is actually an input list.

Load a set of reduced spectra into a single data file.

datalist = [‘file1.fits’, ‘file2.fits’] datapath = ‘~/data/’

The input can also be the name of an IRAF-style file list.

analysis.lorentzian2d(p, x, y)[source]

Compute a 2D Lorentzian distribution at the points x, y.

p is a 5-, 6-, or 7–component sequence:

z = (x-p3) ** 2 / p1 ** 2 + (y-p4) ** 2 / p2 ** 2 [ + (x-p3) * (y-p4) * p5 ]
lorentz = p0 / (1.0 + z) [ + p6]

p[0] – Amplitude (Area of the function) p[1] – x-dispersion p[2] – y-dispersion p[3] – x-central offset p[4] – y-central offset p[5] – optional ellipticitity parameter p[6] – optional constant, vertical offset

analysis.lsq(x, z, w=None, retcov=False, checkvals=True)[source]

Do weighted least-squares fitting.

INPUTS: x : sequence tuple of 1D vectors of equal lengths N, or the transposed numpy.vstack of this tuple z : sequence vector of length N; data to fit to. w : sequence Either an N-vector or NxN array of weights (e.g., 1./sigma_z**2) retcov : bool. If True, also return covariance matrix. checkvals : bool If True, check that all values are finite values. This is safer, but the array-based indexing slows down the function. the tuple of (coef, coeferrs, {cov_matrix})
analysis.lsqsp(x, z, w=None, retcov=False)[source]

Do weighted least-squares fitting on sparse matrices.

INPUTS: x : sparse matrix, shape N x M data used in the least-squares fitting, a set of N rows of M elements each. z : sequence (shape N) or sparse matrix (shape N x 1) data to fit to. w : sequence (shape N) or sparse matrix (shape N x N) Data weights and/or inverse covariances (e.g., 1./sigma_z**2) #retcov : bool. # If True, also return covariance matrix. the tuple of (coef, coeferrs, {cov_matrix}) lsq() SciPy’s sparse module.
analysis.meanr(x, nsigma=3, niter=inf, finite=True, verbose=False, axis=None)[source]

Return the mean of an array after removing outliers.

INPUTS: x – (array) data set to find mean of nsigma – (float) number of standard deviations for clipping niter – number of iterations. finite – if True, remove all non-finite elements (e.g. Inf, NaN) axis – (int) axis along which to compute the mean. ```from numpy import * from analysis import meanr x = concatenate((randn(200),[1000])) print mean(x), meanr(x, nsigma=3) x = concatenate((x,[nan,inf])) print mean(x), meanr(x, nsigma=3) ```
numpy.isfinite()
analysis.medianfilter(data, filtersize, threshold=None, verbose=False)[source]

For now, we assume FILTERSIZE is odd, and that DATA is square!

filt = medianfilter(data, filtersize)

Note that filtersize can be a scalar (e.g., 5) to equally median-filter along both axes, or a 2-vector (e.g., [5, 1]) to apply a rectangular median-filter.

This is about the slowest way to do it, but it was easy to write.

analysis.medianr(x, nsigma=3, niter=inf, finite=True, verbose=False, axis=None)[source]

Return the median of an array after removing outliers.

INPUTS: x – (array) data set to find median of nsigma – (float) number of standard deviations for clipping niter – number of iterations. finite – if True, remove all non-finite elements (e.g. Inf, NaN) axis – (int) axis along which to compute the mean. ```from numpy import * from analysis import medianr x = concatenate((randn(200),[1000])) print median(x), medianr(x, nsigma=3) x = concatenate((x,[nan,inf])) print median(x), medianr(x, nsigma=3) ```
numpy.isfinite()
analysis.mjd(date)[source]

Convert Julian Date to Modified Julian Date, or vice versa.

if date>=2400000.5, add 2400000.5 if date<2400000.5, subtract 2400000.5

analysis.nGaussianCen(p, x, mu)[source]

Compute the sum of N gaussian distributions at the points x. The distributions have central moments defined by the vector mu.

Useful for fitting to partially blended spectral data when you have good measurements of positions (i.e., from 2D tracing).

p is a sequence of length (2N+1). If N=2:

y = [p6 +] p0/(p1*sqrt(2pi)) * exp(-(x-mu1)**2 / (2*p1**2)) +
p3/(p4*sqrt(2pi)) * exp(-(x-mu2)**2 / (2*p4**2))

p[0] – Area of gaussian 1 p[1] – one-sigma dispersion of gaussian 1 p[2] – Area of gaussian 2 p[3] – one-sigma dispersion of gaussian 2

... etc.
p[-1] – optional constant, vertical offset
and

mu1 – central offset (mean location) of gaussian A mu2 – central offset (mean location) of gaussian B

NOTE: FWHM = 2*sqrt(2*ln(2)) * p1 ~ 2.3548*p1

analysis.neworder(N)[source]

Generate a random order the integers [0, N-1] inclusive.

Pads input matrix to size specified.

```out = pad(in, npix)
out = pad(in, npix_rows, npix_cols);  # alternate usage
```

Written by J. Green @ JPL; converted to Python by I. Crossfield

analysis.pb_helperfunction(inputs)[source]

Helper function for prayerbead(). Not for general use.

class analysis.planet(*args)[source]

Very handy planet object.

Best initialized using getobj().

REQUIREMENTS: Database file exoplanets.csv from http://exoplanets.org/
get_t14(*args)[source]

Compute total transit duration (in days) for a transiting planet.

Returns:

Using Eq. 14 of J. Winn’s chapter in S. Seager’s book “Exoplanets.”

get_t23(*args)[source]

Compute full transit duration (in days) for a transiting planet.

Returns:

Using Eq. 15 of J. Winn’s chapter in S. Seager’s book “Exoplanets.”

get_teq(ab, f, reterr=False)[source]

Compute equilibrium temperature.

INPUTS: ab : scalar, 0 <= ab <= 1 Bond albedo. f : scalar, 0.25 <= ab <= 2/3. Recirculation efficiency. A value of 0.25 indicates full redistribution of incident heat, while 2/3 indicates zero redistribution. ```import analysis planet = analysis.getobj('HD 189733 b') planet.get_teq(0.0, 0.25) # zero albedo, full recirculation ``` Seager, “Exoplanets,” 2010. Eq. 3.9
phase(hjd)[source]

Get phase of an orbiting planet.

refer to function analysis.getorbitalphase for full documentation.

rv(**kw)[source]

Compute radial velocity on a planet object for given Julian Date.

EXAMPLE: ```import analysis p = analysis.getobj('HD 189733 b') jd = 2400000. print p.rv(jd) ```

refer to function analysis.rv for full documentation.

rveph(jd)[source]

Compute the most recently elapsed RV emphemeris of a given planet at a given JD. RV ephemeris is defined by the having radial velocity equal to zero.

refer to analysis.rv() for full documentation.

writetext(filename, **kw)[source]
analysis.planettext(planets, filename, delimiter=', ', append=True)[source]

Write planet object info into a delimited line of text.

INPUTS: planets : planet object or list thereof filename : str delimiter : str
analysis.poly2cheby(cin)[source]

Convert straight monomial coefficients to chebychev coefficients.

INPUT: poly coefficients (e.g., for use w/polyval) OUTPUT: chebyt coefficients

analysis.polyfitr(x, y, N, s, fev=100, w=None, diag=False, clip='both', verbose=False, plotfit=False, plotall=False, eps=1e-13, catchLinAlgError=False)[source]

Matplotlib’s polyfit with weights and sigma-clipping rejection.

DESCRIPTION: Do a best fit polynomial of order N of y to x. Points whose fit residuals exeed s standard deviations are rejected and the fit is recalculated. Return value is a vector of polynomial coefficients [pk ... p1 p0]. w: a set of weights for the data; uses CARSMath’s weighted polynomial fitting routine instead of numpy’s standard polyfit. fev: number of function evaluations to call before stopping ‘diag’nostic flag: Return the tuple (p, chisq, n_iter) clip: ‘both’ – remove outliers +/- ‘s’ sigma from fit ‘above’ – remove outliers ‘s’ sigma above fit ‘below’ – remove outliers ‘s’ sigma below fit catchLinAlgError : bool If True, don’t bomb on LinAlgError; instead, return [0, 0, ... 0]. CARSMath Iterates so long as n_newrejections>0 AND n_iter

Generic function to perform Prayer-Bead (residual permutation) analysis.

INPUTS: OPTIONAL INPUTS: (fitparams, modelfunction, arg1, arg2, ... , data, weights) OR: (allparams, (args1, args2, ..), npars=(npar1, npar2, ...)) where allparams is an array concatenation of each functions input parameters. jointpars – list of 2-tuples. For use with multi-function calling (w/npars keyword). Setting jointpars=[(0,10), (0,20)] will always set params[10]=params[0] and params[20]=params[0]. parinfo – None, or list of dicts ‘parinfo’ to pass to the kapteyn.py kpmfit routine. gaussprior – list of 2-tuples, same length as “allparams.” The i^th tuple (x_i, s_i) imposes a Gaussian prior on the i^th parameter p_i by adding ((p_i - x_i)/s_i)^2 to the total chi-squared. axis – int or None If input is 2D, which axis to permute over. (NOT YET IMPLEMENTED!) step – int > 0 Stepsize for permutation steps. 1 by default. (NOT YET IMPLEMENTED!) verbose – bool Print various status lines to console. maxiter – int Maximum number of iterations for _each_ fitting step. maxfun – int Maximum number of function evaluations for _each_ fitting step. threads – int Number of threads to use (via multiprocessing.Pool) ```TBW ``` kapteyn, Planetary phase curve routines, numpy
analysis.removeoutliers(data, nsigma, remove='both', center='mean', niter=inf, retind=False, verbose=False)[source]

Strip outliers from a dataset, iterating until converged.

INPUT: OPTIONAL INPUTS: data – 1D numpy array. data from which to remove outliers. nsigma – positive number. limit defining outliers: number of standard deviations from center of data. remove – (‘min’|’max’|’both’) respectively removes outliers below, above, or on both sides of the limits set by nsigma. center – (‘mean’|’median’|value) – set central value, or method to compute it. niter – number of iterations before exit; defaults to Inf, which can occasionally result in empty arrays returned retind – (bool) whether to return index of good values as second part of a 2-tuple. ```from numpy import hist, linspace, randn from analysis import removeoutliers data = randn(1000) hbins = linspace(-5,5,50) d2 = removeoutliers(data, 1.5, niter=1) hist(data, hbins) hist(d2, hbins) ```
analysis.returnSections(time, dtmax=0.1)[source]

Return 2-tuples that are the indices of separate sections, as indicated by breaks in a continuous and always-increasing time series.

INPUTS: time : 1D NumPy array The time index of interest. Should be always increasing, such that numpy.diff(time) is always positive. dtmax : float Any break in ‘time’ equal to or larger than this indicates a new segment. ```import transit # Simulate a time series with 30-minute sampling: t1 = np.arange(0, 3.7, 0.5/24) t2 = np.arange(5, 70, 0.5/24) t3 = np.arange(70.2, 85, 0.5/24) days = np.concatenate((t1, t2, t3)) ret = transit.returnSections(days, dtmax=0.1) # If each segment was correctly identified, these print 'True': print (t1==days[ret[0][0]:ret[0][1]+1]).all() print (t2==days[ret[1][0]:ret[1][1]+1]).all() print (t3==days[ret[2][0]:ret[2][1]+1]).all() ```
analysis.rv(p, jd=None, e=None, reteanom=False, tol=1e-08)[source]

Compute unprojected astrocentric RV of a planet for a given JD in m/s.

INPUTS: p : planet object, or 5- or 6-sequence planet object: see get_obj() OR: sequence: [period, t_peri, ecc, a, long_peri, gamma] (if gamma is omitted, it is set to zero) (long_peri should be in radians!) jd : NumPy array Dates of observation (in same time system as t_peri). e : NumPy array Eccentric Anomaly of observations (can be precomputed to save time) ```jd = 2454693 # date: 2008/08/14 p = getobj('55 Cnc e') # planet: 55 Cnc e vp = rv(p, jd) ``` returns vp ~ 1.47e5 [m/s]

The result will need to be multiplied by the sine of the inclination angle (i.e., “sin i”). Positive radial velocities are directed _AWAY_ from the observer.

To compute the barycentric radial velocity of the host star, scale the returned values by the mass ratio -m/(m+M).

analysis.rveph(p, jd)[source]

Compute the most recently elapsed RV emphemeris of a given planet at a given JD. RV ephemeris is defined by the having radial velocity equal to zero.

EXAMPLE: ```from analysis import getobj, rveph jd = 2454693 # date: 2008/08/14 p = getobj('55cnce') # planet: 55 Cnc e t = rveph(p, jd) ``` returns t ~

analysis.rvstar(p, jd=None, e=None, reteanom=False, tol=1e-08)[source]

Compute radial velocity of a star which has an orbiting planet.

INPUTS: p : planet object, or 5- or 6-sequence planet object: see get_obj() OR: sequence: [period, t_peri, ecc, K, long_peri, gamma] (if gamma is omitted, it is set to zero) jd : NumPy array Dates of observation (in same time system as t_peri). e : NumPy array Eccentric Anomaly of observations (can be precomputed to save time) ```jd = 2454693 # date: 2008/08/14 p = getobj('55 Cnc e') # planet: 55 Cnc e vp = rv(p, jd) ``` Positive radial velocities are directed _AWAY_ from the observer.
analysis.scale_mcmc_stepsize(accept, func, params, stepsize, z, sigma, numit=1000, scales=[0.1, 0.3, 1.0, 3.0, 10.0], args=(), nstep=1, posdef=None, holdfixed=None, retall=False, jointpars=None)[source]

Run generic_mcmc() and scale the input stepsize to match the desired input acceptance rate.

INPUTS: mostly the same as for generic_mcmc(), but also with: accept : float between 0 and 1 desired acceptance rate; typically between 0.15 - 0.5. scales : sequence of floats test scaling parameters; measure for these, then interpolate to get ‘accept’ retall : bool if True, return tuple (scalefactor, acceptances, scales). Otherwise return only the scaler ‘scalefactor.’ pylab (for pylab.interp())
analysis.snr(data, axis=None, nsigma=None)[source]
Compute the quantity:
data.mean(axis=axis)/data.std(axis=axis)

for the specified data array/vector along the specified axis.

‘nsigma’ is used to reject outliers.

Output will be a scalar (axis is None) or numpy array, as appropriate.

analysis.spliner(x, y, k=3, sig=5, s=None, fev=100, w=None, clip='both', verbose=False, plotfit=False, plotall=False, diag=False)[source]

Matplotlib’s polyfit with sigma-clipping rejection.

Do a scipy.interpolate.UnivariateSpline of order k of y to x.

Points whose fit residuals exeed s standard deviations are rejected and the fit is recalculated. Returned is a spline object.

Iterates so long as n_newrejections>0 AND n_iter<fev.

OPTIONAL INPUTS:

err: a set of errors for the data fev: number of function evaluations to call before stopping ‘diag’nostic flag: Return the tuple (p, chisq, n_iter) clip: ‘both’ – remove outliers +/- ‘s’ sigma from fit

‘above’ – remove outliers ‘s’ sigma above fit ‘below’ – remove outliers ‘s’ sigma below fit

analysis.stdfilt(vec, wid=3)[source]

Compute the standard deviation in a sliding window.

INPUTS: vec : 1D sequence data to filter wid : int, odd width of filter; ideally odd (not even).
analysis.stdfilt2d(data, filtersize, threshold=None, verbose=False)[source]

For now, we assume FILTERSIZE is odd, and that DATA is square!

filt = stdfilt2d(data, filtersize)

This is about the slowest way to do it, but it was easy to write.

analysis.stdr(x, nsigma=3, niter=inf, finite=True, verbose=False, axis=None)[source]

Return the standard deviation of an array after removing outliers.

INPUTS: x – (array) data set to find std of nsigma – (float) number of standard deviations for clipping niter – number of iterations. finite – if True, remove all non-finite elements (e.g. Inf, NaN) axis – (int) axis along which to compute the mean. ```from numpy import * from analysis import stdr x = concatenate((randn(200),[1000])) print std(x), stdr(x, nsigma=3) x = concatenate((x,[nan,inf])) print std(x), stdr(x, nsigma=3) ```
numpy.isfinite()
analysis.stdres(data, bins=None, oversamp=None, dataindex=None)[source]

Compute the standard deviation in the residuals of a data series after average-binning by specified amounts.

INPUTS: data - 1D numpy array Data to analyze. bins - sequence Factors by which to bin down. If None, use 1:sqrt(data.size) and set dataindex=None. oversamp - int Number of times to shift, resample, and bin the data. Large values take longer, but give a less “noisy” estimate (which can be a problem at large bin sizes) dataindex - 1D numpy array Values across which data are indexed (i.e., ‘time’). If not None, bins apply to dataindex rather than to data and should be increasing intervals of dataindex (rather than the number of points to bin down by). ```import numpy as np import analysis as an import pylab as py npts = 1e4 t = np.arange(npts) data = np.random.normal(size=npts) binfactors = np.arange(1, npts/2.+1) bindown_result = an.stdres(data, binfactors, dataindex=t, oversamp=1) py.figure() py.subplot(211) py.plot(t, data, 'k') py.xlabel('Time') py.ylabel('Data value') py.minorticks_on() py.title('Bin-down test: Gaussian Noise') py.subplot(212) py.loglog(binfactors, bindown_result, '-b', linewidth=2) py.loglog(binfactors, data.std()/np.sqrt(binfactors), '--r') py.xlabel('Binning factor') py.ylabel('RMS of binned data') py.legend(['Binned RMS', '1/sqrt(N)']) ```
analysis.test_eccentric_anomaly(ecc, manom, tol=1e-08)[source]

Test various methods of computing the eccentric anomaly.

ecc = scalar; manom = 1D NumPy array

Just run, e.g.:
```ecc = 0.15
p_orb = 3.3
mean_anom = 2*pi*linspace(0, p_orb, 10000)/p_orb
an.test_eccentric_anomaly(ecc, mean_anom, tol=1e-10)
```
analysis.total_least_squares(data1, data2, data1err=None, data2err=None, print_results=False, ignore_nans=True, intercept=True, return_error=False, inf=10000000000.0)[source]

Use Singular Value Decomposition to determine the Total Least Squares linear fit to the data. (e.g. http://en.wikipedia.org/wiki/Total_least_squares) data1 - x array data2 - y array

if intercept:
returns m,b in the equation y = m x + b
else:
returns m

print tells you some information about what fraction of the variance is accounted for

ignore_nans will remove NAN values from BOTH arrays before computing

data1,data2 : np.ndarray
Vectors of the same length indicating the ‘x’ and ‘y’ vectors to fit
data1err,data2err : np.ndarray or None
Vectors of the same length as data1,data2 holding the 1-sigma error values
analysis.travisplanet(p)[source]

Generate a line of text for Travis Barman’s planet table.

INPUT: a planet object from getobj().

analysis.trueanomaly(ecc, eanom=None, manom=None)[source]

Calculate (Keplerian, orbital) true anomaly.

One optional input must be given.

INPUT: OPTIONAL_INPUTS: ecc – scalar. orbital eccentricity. eanom – scalar or Numpy array. Eccentric anomaly. See eccentricanomaly() manom – scalar or sequence. Mean anomaly, equal to 2*pi*(t - t0)/period
analysis.wmean(a, w, axis=None)[source]

Perform a weighted mean along the specified axis.

INPUTS: a : sequence or Numpy array data for which weighted mean is computed w : sequence or Numpy array weights of data – e.g., 1./sigma^2 reterr : bool If True, return the tuple (mean, err_on_mean), where err_on_mean is the unbiased estimator of the sample standard deviation. wstd()
analysis.wmeanfilt(vec, wid=3, w=None)[source]

Compute the (weighted) mean in a sliding window.

INPUTS: vec : 1D sequence data to filter wid : int, odd width of filter; ideally odd (not even).
analysis.wstd(a, w, axis=None)[source]

Perform a weighted standard deviation along the specified axis. If axis=None, then the weighted standard deviation of the entire array is computed.

Note that this computes the _sample_ standard deviation; Numpy/Scipy computes the _population_ standard deviation, which is greater by a factor sqrt(N/N-1). This effect is small for large datasets.

analysis.xcorr2_qwik(img0, img1)[source]

Perform quick 2D cross-correlation between two images.

Images must be the same size.

Computed via squashing the images along each dimension and computing 1D cross-correlations.

Previous topic

Phase curve spectroscopic analysis, and associated subroutines.

Next topic

Plotting and analysis tools