The Fityk mini-language consists of commands.
Basically, there is one command per line. If for some reason it is more comfortable to place more than one command in one line, they can be separated with a semicolon (;).
Most of the commands can have arguments separated by a comma (,), e.g. delete %a, %b, %c.
Most of the commands can be shortened: e.g. you can type inf or in or i instead of info. See Appendix B. Grammar for details.
The symbol ‘#’ starts a comment - everything from the hash (#) to the end of the line is ignored.
Data files are read using the xylib library.
Points are loaded from files using the command:
dataslot < filename[:xcol:ycol:scol:block] [filetype options...]
where
If the filename contains blank characters, a semicolon or comma, it should be put inside single quotation marks (together with colon-separated indices, if any).
Multiple y columns and/or blocks can be specified, see the examples below:
@0 < foo.vms
@0 < foo.fii text first-line-header
@0 < foo.dat:1:4:: # x,y - 1st and 4th columns
@0 < foo.dat:1:3,4:: # load two datasets (with y in columns 3,4)
@0 < foo.dat:1:3..5:: # load three datasets (with y in columns 3,4,5)
@0 < foo.dat:1:4..6,2:: # load four datasets (y: 4,5,6,2)
@0 < foo.dat:1:2..:: # load 2nd and all the next columns as y
@0 < foo.dat:1:2:3: # read std. dev. of y from 3rd column
@0 < foo.dat:0:1:: # x - 0,1,2,..., y - first column
@0 < 'foo.dat:0:1::' # the same
@0 < foo.raw::::0,1 # load the first two blocks of data (as one dataset)
Information about loaded data can be obtained with:
info data [in @0]
The xylib library can read TSV or CSV formats (tab or comma separated values). In fact, the values can be separated by any whitespace character or by one of the ,;: punctuation characters, or by any combination of these.
Empty lines and comments that start with hash (#) are skipped.
Since there are many files in the world that contain numeric data mixed with text, unless the strict option is given, any text that cannot be interpreted as a number is regarded as the start of a comment (the rest of the line is ignored).
Note that the file is parsed regardless of blocks and columns specified by the user. The data read from the file are first stored in a table with m columns and n rows. If some of the lines have 3 numbers in them, and some have 5 numbers, we can either discard the lines that have 3 numbers or we can discard the numbers in the 4th and 5th columns. Usually the latter is done, unless it seems that the shorter lines should be ignored. The line is ignored:
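The rules above (flexible separators, hash comments, and treating the first non-numeric token as the start of a comment) can be illustrated with a short Python sketch. This is not xylib's parser: the function name parse_line is hypothetical, the strict option is not modelled, and Python's float() accepts a few spellings (such as nan or inf) that xylib may treat differently.

```python
import re

# Separators named in the text: any whitespace, or one of , ; :
# (and any combination of these).
_SEPARATORS = re.compile(r"[\s,;:]+")


def parse_line(line):
    """Return the leading numbers found in one line of a text data file."""
    # Everything from '#' to the end of the line is a comment.
    line = line.split("#", 1)[0]
    numbers = []
    for token in _SEPARATORS.split(line.strip()):
        if not token:
            continue
        try:
            numbers.append(float(token))
        except ValueError:
            # Non-numeric text starts a comment: ignore the rest of the line.
            break
    return numbers


print(parse_line("1.5, 2;3:4  # trailing comment"))  # [1.5, 2.0, 3.0, 4.0]
print(parse_line("10 20 counts 30"))                 # [10.0, 20.0]
print(parse_line("# only a comment"))                # []
```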
Note
Xylib does not handle NaNs and infs in the data well. This will be improved in the future.
Data blocks and columns may have names. These names are used to set a title of the dataset (see Working with multiple datasets for details). If the option first-line-header is given and the number of words in the first line is equal to the number of data columns, each word is used as a name of corresponding column. If the number of words is different, the first line is used as a name of the block. If the last-line-header option is given, the line preceding the first data line is used to set either column names or the block name.
If the file starts with the “LAMMPS (” string, the last-line-header option is set implicitly. This is very helpful when plotting data from LAMMPS log files.
We often have the situation that only a part of the data from a file is of interest. In Fityk, each point is either active or inactive. Inactive points are excluded from fitting and all calculations. A data transformation:
A = boolean-condition
can be used to change the state of points.
In the GUI, there is a Data-Range Mode that allows you to activate and deactivate points with the mouse.
When fitting data, we assume that only the y coordinate is subject to
statistical errors in measurement. This is a common assumption.
To see how the y standard deviation \(\sigma\) influences fitting
(optimization), look at the weighted sum of squared residuals formula
in Nonlinear optimization.
We can also think about weights of points – every point has a weight
assigned, that is equal to \(w_i = 1/\sigma_i^2\).
Standard deviation of points can be read from file together with the x and y coordinates. Otherwise, it is set either to max(sqrt(y), 1.0) or to 1, depending on the value of the data-default-sigma option. Setting the std. dev. to the square root of the value is common and is theoretically justified when y is the number of independent events. You can always change the standard deviation, e.g. make it equal for every point with the command: S=1. See Data point transformations for details.
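As a rough illustration of the two default choices and of the weight that follows from a standard deviation, here is a small Python sketch. The function names and the boolean flag use_sqrt are hypothetical stand-ins for the data-default-sigma option, and returning 1 for negative y is an assumption, since the text does not say what happens there.

```python
import math


def default_sigma(y, use_sqrt=True):
    """Default std. dev. of a point: max(sqrt(y), 1.0), or simply 1."""
    if not use_sqrt:
        return 1.0
    # For y >= 0 this equals max(sqrt(y), 1.0).
    # Assumption: negative y also yields 1.0.
    return math.sqrt(y) if y > 1.0 else 1.0


def weight(sigma):
    """Weight of a point, w = 1 / sigma^2."""
    return 1.0 / sigma ** 2


for y in [0.0, 4.0, 100.0, 400.0]:
    s = default_sigma(y)
    print(y, s, weight(s))
```

Points with many counts get a larger sigma and therefore a smaller weight in the fit.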
Note
It is often the case that the user is not sure what standard deviation should be assumed, but it is her responsibility to pick something.
Every data point has four properties: x coordinate, y coordinate, standard deviation of y and active/inactive flag. Lower case letters x, y, s, a stand for these properties before transformation, and upper case X, Y, S, A for the same properties after transformation. M stands for the number of points.
Data can be transformed using assignments. Command Y=-y will change the sign of the y coordinate of every point.
You can apply transformation to selected points: Y[3]=1.2 will change point with index 3 (which is 4th point, because first has index 0), and Y[3..6]=1.2 will do the same for points with indices 3, 4, 5, but not 6. Y[2..]=1.2 will apply the transformation to points with index 2 and above. You can guess what Y[..6]=1.2 does.
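The index-range rules above (first index 0, lower bound included, upper bound excluded, either bound optional) match Python's half-open slices, so they can be sketched as follows. The helper name assign is hypothetical; it only mimics the selection of points, not the Fityk parser.

```python
def assign(values, new_value, start=None, stop=None):
    """Mimic Y[start..stop]=new_value on a copy of the list."""
    result = list(values)
    # range(...)[start:stop] yields the selected indices; None means open-ended.
    for i in range(len(result))[start:stop]:
        result[i] = new_value
    return result


y = [0.0] * 8
print(assign(y, 1.2, 3, 4))   # like Y[3]=1.2: only index 3
print(assign(y, 1.2, 3, 6))   # like Y[3..6]=1.2: indices 3, 4, 5
print(assign(y, 1.2, 2))      # like Y[2..]=1.2: index 2 and above
print(assign(y, 1.2, None, 6))  # like Y[..6]=1.2: indices 0 to 5
```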
Most operations are executed sequentially for points from the first to the last one. n stands for the index of the currently transformed point. The sequence of commands:
M=500; x=n/100; y=sin(x)
will generate the sinusoid dataset with 500 points.
If you have more than one dataset, you have to specify explicitly which dataset the transformation applies to. See Working with multiple datasets for details.
Note
Points are kept sorted according to their x coordinate, so changing x coordinate of points will also change the order and indices of points.
Expressions can contain:

The value of a data expression can be shown using the command info, see examples at the end of this section. The precision of printed numbers is governed by the option info-numeric-format.
Linear interpolation of y (or any other property: s,a,X,Y,S,A) between two points can be calculated using special syntax:
y[x=expression]
If the given x is outside of the current data range, the value of the first/last point is returned.
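The behaviour of y[x=expression] can be sketched in Python as plain linear interpolation that is clamped to the first or last point outside the data range. The function name interpolate is hypothetical, and the sketch assumes the points are already sorted by x, as the manual says they are.

```python
from bisect import bisect_right


def interpolate(xs, ys, x):
    """Linearly interpolate ys at x; xs must be sorted in ascending order."""
    # Outside the data range: return the value of the first/last point.
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    # Here xs[i-1] <= x < xs[i], so the two x values always differ.
    i = bisect_right(xs, x)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


xs = [0.0, 1.0, 3.0]
ys = [0.0, 10.0, 30.0]
print(interpolate(xs, ys, 2.0))   # 20.0
print(interpolate(xs, ys, -5.0))  # 0.0 (clamped to the first point)
print(interpolate(xs, ys, 10.0))  # 30.0 (clamped to the last point)
```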
Note
All operations are performed on real numbers.
Two numbers that differ by less than epsilon (see option epsilon), i.e. abs(a-b)<epsilon, are considered equal.
Indices are also computed in real number domain, and then rounded to the nearest integer.
Transformations can be joined with comma (,), e.g.
X=y, Y=x
swaps axes.
Before and after executing transformations, points are always sorted according to their x coordinate. You can temporarily change the order of points using order=t, where t is one of x, y, s, a, -x, -y, -s, -a. This only makes sense for a sequence of transformations (joined with comma), as after finishing each transformation points will be reordered again. This feature is rarely useful.
Points can be deleted using the following syntax:
delete[index-or-range]
or
delete(condition)
and created simply by increasing the value of M.
There are also aggregate functions:
min (the smallest value),
max (the largest value),
sum (sum of all values),
avg (arithmetic mean of all values),
stddev (standard deviation of all values),
darea (can be used to normalize the area; it is implemented as t*(x[n+1]-x[n-1])/2 summed over the points, where t is the value of the expression).
They have two forms:
aggregatefunc(expression)
aggregatefunc(expression if condition)
In the first form the value of expression is calculated for all points. In the second, only the points for which the condition is true are taken into account.
A true value in a data expression is represented numerically by 1, and false by 0, so sum can also be used to count points that fulfil given criteria.
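The two forms of aggregate functions, the counting trick with sum, and the darea formula can be sketched in Python. The names aggregate and darea and the sample points are illustrative only. How Fityk treats x[n-1] and x[n+1] at the first and last point is not stated above; this sketch assumes the index is clamped to the valid range, which makes darea(y) equal to the trapezoidal area.

```python
from statistics import mean

# Each point has x, y and an active flag a (sample data for illustration).
points = [
    {"x": 0.0, "y": 1.0, "a": True},
    {"x": 1.0, "y": 3.0, "a": True},
    {"x": 2.0, "y": 5.0, "a": False},
    {"x": 4.0, "y": 3.0, "a": True},
]


def aggregate(func, expression, condition=lambda p: True):
    """func(expression) or func(expression if condition) over all points."""
    return func(expression(p) for p in points if condition(p))


def darea(expression):
    """Sum of t*(x[n+1]-x[n-1])/2, with indices clamped at both ends (assumed)."""
    last = len(points) - 1
    total = 0.0
    for n, p in enumerate(points):
        x_next = points[min(n + 1, last)]["x"]
        x_prev = points[max(n - 1, 0)]["x"]
        total += expression(p) * (x_next - x_prev) / 2
    return total


print(aggregate(max, lambda p: p["y"]))                    # like max(y)
print(aggregate(max, lambda p: p["y"], lambda p: p["a"]))  # like max(y if a)
print(aggregate(sum, lambda p: p["y"] > 2))                # like sum(y>2): a count
print(aggregate(mean, lambda p: p["y"]))                   # like avg(y)
print(darea(lambda p: p["y"]))                             # like darea(y)
```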
A few examples:
# integrate
Y[1...] = Y[n-1] + y[n]
# delete inactive points
delete(not a)
# halve the number of points, averaging x and adding y
x[...-1] = (x[n]+x[n+1])/2
y[...-1] = y[n]+y[n+1]
delete(n%2==1)
# change x scale of diffraction pattern (2theta -> Q)
X = 4*pi * sin(x/2*pi/180) / 1.54051
# make equal step, keep the number of points the same
X = x[0] + n * (x[M-1]-x[0]) / (M-1),  Y = y[x=X], S = s[x=X], A = a[x=X]
# take the first 2000 points, average them and subtract as background
Y = y - avg(y if n<2000)
# Fityk can be used as a simple calculator
i 2+2 #4
i sin(pi/4)+cos(pi/4) #1.41421
i gamma(10) #362880
# normalize data area
Y = y / darea(y)
# calculations that use aggregate functions
i max(y) # the largest y value
i max(y if a) # the largest y value in the active range
i sum(y>100) # count the points that have y greater than 100
i sum(y>avg(y)) # count the points that have y greater than the arithmetic mean
i darea(y-F(x) if 20<x and x<25) # example of more complex syntax
You may postpone reading this section and read about the Model first.
Variables ($foo) and functions (%bar) can be used in data transformations, e.g.:
Y = y / $foo  # divides all y's by $foo
Y = y - %f(x) # subtracts function %f from data
Y = y - @0.F(x) # subtracts all functions in F
# Fit constant x-correction (e.g. a shift in the scale of the instrument
# collecting data), correct the data and remove the correction from the model.
Z = Constant(~0)
fit
X = x + @0.Z(x) # data transformation is here
Z = 0
In the Baseline Mode in the GUI, functions Spline() and Polyline() are used to subtract the background that has been manually marked by the user. Clicking Strip background results in commands like this:
%bg0 = Spline(14.2979,62.1253, 39.5695,35.0676, 148.553,49.9493)
Y = y - %bg0(x)
Note
The GUI uses functions named %bgX, where X is the index of the dataset, and the type of the function is either Spline or Polyline, to handle the baseline. This allows the user to set the function manually (or in a script) and then edit the baseline in the Baseline Mode.
Values of the function parameters (e.g. %fun.a0) and pseudo-parameters Center, Height, FWHM and Area (e.g. %fun.Area) can also be used. Pseudo-parameters are supported only by functions that know how to calculate these properties.
It is also possible to calculate some properties of %functions:
A few examples:
info numarea(%fun, 0, 100, 10000) # shows area of function %fun
info %_1(extremum(%_1, 40, 50)) # shows extremum value
# calculate FWHM numerically, value 50 can be tuned
$c = {%f.Center}
i findx(%f, $c, $c+50, %f.Height/2) - findx(%f, $c, $c-50, %f.Height/2)
i %f.FWHM # should give almost the same.
Let us call a set of data that usually comes from one file – a dataset. All operations described above assume only one dataset. If there are more datasets created, it must be explicitly stated which dataset the command is being applied to, e.g. M=500 in @0. Datasets have numbers and are referenced by ‘@’ with the number, e.g. @3. @* means all datasets (e.g. Y=y/10 in @*).
To load a dataset from a file, use one of the commands:
@n < filename:xcol:ycol:scol:block filetype options...
@+ < filename:xcol:ycol:scol:block filetype options...
The first one uses existing data slot and the second one creates a new slot. Using @+ increases the number of datasets, and command delete @n decreases it.
The dataset can be duplicated (@+ = @n) or transformed, more on this in the next section.
Each dataset has a separate model, that can be fitted to the data. This is explained in the next chapter.
Each dataset also has a title (it does not have to be unique, however). When a file is loaded, a title is automatically created:
Titles can be changed using the command:
set @n.title=new-title
To print the title of the dataset, type info title in @n.
You can calculate values of a data expression for each dataset and print a list of results, e.g. i+ avg(y) in @*.
There is also another kind of transformation, the dataset transformation, which operates on a whole dataset, not on single points:
@n = dataset-transformation @m
or more generally:
@n = dataset-transformation @m + @k + ...
where dataset-transformation can be one of:
A sum of datasets (@n + @m + ...) contains all points from all component datasets. If datasets have the same x values, the sum of y values can be obtained using @+ = sum_same_x @n + @m + ....
Examples:
@+ = @0 # duplicate the dataset
@+ = @0 + @1 # create a new dataset from @0 and @1
@0 = rm_shirley_bg @0 # remove Shirley background
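The difference between a plain sum of datasets and sum_same_x, as described above, can be sketched in Python with datasets represented as lists of (x, y) pairs. The function names are hypothetical, standard deviations and active flags are left out, and x values are compared exactly here, whereas Fityk compares numbers using the epsilon option.

```python
from collections import defaultdict


def concatenate(*datasets):
    """Like @n + @m: keep all points from all datasets, sorted by x."""
    return sorted(point for dataset in datasets for point in dataset)


def sum_same_x(points):
    """Merge points that share an x value by adding their y values."""
    totals = defaultdict(float)
    for x, y in points:
        totals[x] += y
    return sorted(totals.items())


first = [(1.0, 10.0), (2.0, 20.0)]
second = [(1.0, 1.0), (2.0, 2.0)]

merged = concatenate(first, second)
print(merged)              # four points, two for each x
print(sum_same_x(merged))  # two points, y values added
```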
Command:
info dataslot (expression , ...) > file.tsv
can export data to an ASCII TSV (tab separated values) file.
To export data in a 3-column (x, y and standard deviation) format, use:
info @0 (x, y, s) > file.tsv
If a is not listed in the list of columns, like in the example above, only the active points are exported.
All expressions that can be used on the right-hand side of data transformations can also be used in the column list. Additionally, F and Z can be used with dataset prefix, e.g.
info @0 (n+1, x, y, F(x), y-F(x), Z(x), %foo(x), a, sin(pi*x)+y^2) > file.tsv
The option info-numeric-format can be used to change the format and precision of all numbers.
The model F (the function that is fitted to the data) is computed
as a sum of component functions, \(F = \sum_i f_i\).
Each component function is one of the functions defined in the program,
such as Gaussian or polynomial.
To avoid confusion we will always use:
Function \(f_i(x; \mathbf{a})\) is a function of x,
and depends on a vector of parameters \(\mathbf{a}\).
This vector contains all fitted parameters.
Because we often have the situation that the error in the x coordinate
of data points can be modeled with function \(Z(x; \mathbf{a})\),
we introduce this term to the model, and the final formula is:
\[F(x; \mathbf{a}) = \sum_i f_i(x + Z(x; \mathbf{a}); \mathbf{a})\]
where \(Z(x; \mathbf{a}) = \sum_i z_i(x; \mathbf{a})\).
Note that the same x-correction Z
is used in all functions \(f_i\).
Now we will have a closer look at component functions.
Every function \(f_i\) has a type chosen from the function types
available in the program. The same is true about functions \(z_i\).
One of these types is the Gaussian. It has the following formula:
![f_G(x; a_0, a_1, a_2)=a_{0}\exp\left[-\ln(2)\left(\frac{x-a_{1}}{a_{2}}\right)^{2}\right]](_images/math/c104a975652392e8192a26551dc776a5fbdcf9c7.png)
There are three parameters of Gaussian. These parameters do not depend on x. There must be one variable bound to each function’s parameter.
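A direct Python transcription of the Gaussian formula may help to see what the three parameters mean. The parameter names height, center and hwhm are the ones used in the Fityk examples in this chapter (for a0, a1 and a2); the function itself is only an illustration.

```python
import math


def gaussian(x, height, center, hwhm):
    """f_G(x) = height * exp(-ln(2) * ((x - center) / hwhm)^2)."""
    return height * math.exp(-math.log(2) * ((x - center) / hwhm) ** 2)


height, center, hwhm = 66254.0, 24.7, 0.264

# At the center the function equals the height.
print(gaussian(center, height, center, hwhm))
# At a distance of hwhm from the center it drops to half of the height,
# which is why the third parameter is the half width at half maximum.
print(gaussian(center + hwhm, height, center, hwhm) / height)
```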
Variables in Fityk have names prefixed with the dollar symbol ($). A variable is created by assigning a value to it, e.g.
$foo=~5.3
$c=3.1
$bar=5*sin($foo)
The variables like the first one, $foo, created by assigning to it a real number prefixed with ‘~’, will be called simple-variables. The ‘~’ means that the value assigned to the variable can be changed when fitting the model to the data.
Each simple-variable is independent. In optimization terms, it corresponds to one dimension of the space where we will look for the minimum.
In the above example, the variable $c is actually a constant. $bar depends on the value of $foo. When $foo changes, the value of $bar also changes. Variables like $bar will be called compound-variables. Compound-variables can be built using operators +, -, *, /, ^ and the functions sqrt, exp, log10, ln, sin, cos, tan, sinh, cosh, tanh, atan, asin, acos, erf, erfc, lgamma, abs, voigt. This is a subset of the functions used in data transformations.
The value of the data expression can be used in the variable definition. The expression must be in braces, e.g. $bleh={3+5}. The simple variable can be created by preceding the left brace with the tilde ($bleh=~{3+5}). A few examples:
$foo = {y[0]}
$foo2 = {y[0] in @0}  # dataset can be given if necessary
$foo3 = {min(y if a) in @0}
Sometimes it is useful to freeze a variable, i.e. to prevent it from changing while fitting. There is no special syntax for it, but it can be done using data expressions in this way:
$a = ~12.3 # $a is fittable
$a = {$a}  # $a is not fittable
$a = ~{$a}  # $a is fittable again
It is also possible to define a variable as e.g. $bleh=~9.1*exp(~2). In this case two simple-variables (with values 9.1 and 2) are created automatically.
Automatically created variables are named $_1, $_2, $_3, and so on.
Variables can be deleted using the command:
delete $variable
Some fitting algorithms need to randomize the parameters of the fitted function (i.e. they need to randomize simple variables). For this purpose, the simple variable can have a specified domain. Note that the domain does not imply any constraints on the value the variable can have – it is only a hint for fitting algorithms. Domains are used by Nelder-Mead method and Genetic Algorithms. The syntax is as follows:
$a = ~12.3 [11 +- 5] # center and width of the domain are given
$b = ~12.3 [ +- 5] # if the center of the domain is not specified,
                   # the value of the variable is used
If the domain is not specified, the value of the variable-domain-percent option is used (the domain is +/- value-of-variable * variable-domain-percent / 100).
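The three ways a domain can be determined (center and width given, only width given, nothing given) can be sketched in Python. The function name domain and its arguments are hypothetical, percent stands for the value of the variable-domain-percent option, and taking the absolute value for negative variables is an assumption not stated in the text.

```python
def domain(value, percent, center=None, width=None):
    """Return (low, high) of the domain used when randomizing a variable."""
    if center is None:
        # [ +- width ]: the value of the variable is the center.
        center = value
    if width is None:
        # No domain given: +/- value * percent / 100.
        # Assumption: abs() keeps the width positive for negative values.
        width = abs(value) * percent / 100
    return (center - width, center + width)


print(domain(12.3, 10.0, center=11, width=5))  # like $a = ~12.3 [11 +- 5]
print(domain(12.3, 10.0, width=5))             # like $b = ~12.3 [ +- 5]
print(domain(12.3, 10.0))                      # no domain specified
```

As the text notes, such a domain is only a hint for randomization and does not constrain the fitted value.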
Let us go back to functions. Function types have names that start with upper case letter, e.g. Linear or Voigt. Functions (i.e. function instances) have names prefixed with a percent symbol, e.g. %func. Every function has a type and variables bound to its parameters.
info types shows the list of available function types. info FunctionType (e.g. info Pearson7) shows formula of the FunctionType.
Functions can be created by giving the type and the correct number of variables in brackets, e.g.
%f1 = Gaussian(~66254., ~24.7, ~0.264)
%f2 = Gaussian(~6e4, $ctr, $b+$c)
%f3 = Gaussian(height=~66254., hwhm=~0.264, center=~24.7)
Every expression that is valid on the right-hand side of a variable assignment can be used as a variable. If it is not just a name of a variable, an automatic variable is created. In the above examples, two variables were implicitly created for %f2: the first for the value 6e4 and the second for $b+$c.
If the names of function’s parameters are given (like for %f3), the variables can be given in any order.
Function types can have default values specified for some parameters. The variables for such parameters can be omitted, e.g.:
=-> i Pearson7
Pearson7(height, center, hwhm, shape=2) = height/(1+((x-center)/hwhm)^2*(2^(1/shape)-1))^shape
=-> %f4 = Pearson7(height=~66254., center=~24.7, fwhm=~0.264) # no shape is given
New function %f4 was created.
A deep copy of function (i.e. all variables that it depends on are also copied) can be made using the command:
%function = copy(%another_function)
Functions can be also created with the command guess, as described in Guessing peak location.
You can change a variable bound to any of the function parameters in this manner:
=-> %f = Pearson7(height=~66254., center=~24.7, fwhm=~0.264)
New function %f was created.
=-> %f.center=~24.8
=-> $h = ~66254
=-> %f.height=$h
=-> info %f
%f = Pearson7($h, $_5, $_3, $_4)
=-> $h = ~60000 # variables are kept by name, so this also changes %f
=-> %p1.center = %p2.center + 3 # keep fixed distance between %p1 and %p2
Functions can be deleted using the command:
delete %function
Variadic function types have a variable number of parameters. Two variadic function types are defined:
Spline(x1, y1, x2, y2, ...)
Polyline(x1, y1, x2, y2, ...)
For example %f:
%f = Spline(22.1, 37.9, 48.1, 17.2, 93.0, 20.7)
is the cubic spline interpolation through points (22.1, 37.9), (48.1, 17.2), ....
The Polyline function is similar, but gives the polyline interpolation.
Both Spline and Polyline functions are primarily used for the manual baseline subtraction via the GUI.
User-defined function types can be created using command define, and then used in the same way as built-in functions.
Example:
define MyGaussian(height, center, hwhm) = height*exp(-ln(2)*((x-center)/hwhm)^2)
The name of new type must start with an upper-case letter, contain only letters and digits and have at least two characters.
The name of the type is followed by parameters in brackets.
A parameter name must start with a lowercase letter and contain only lowercase letters, digits and the underscore (‘_’).
The name “x” is reserved, do not put it into parameter list, just use it on the right-hand side of the definition.
There are special names of parameters, that Fityk understands:
Parameters with such names do not need default values. fwhm means full width at half maximum (FWHM), hwhm means half width at half maximum, i.e. fwhm/2.
Each parameter should have a default value (see examples below). Default values allow adding a peak with the command guess or with one click in the GUI.
The default value can be a number or an expression that contains the special names listed above, with the exception of hwhm (use fwhm/2 instead).
UDFs can be defined in a few ways:
When giving a full formula, the right-hand side of the equality sign is similar to the definition of a variable, but the formula can also depend on x. Hopefully the examples at the end of this section make the syntax clear.
How it works internally
The formula is parsed, derivatives of the formula are calculated symbolically, all expressions are simplified (but there is a lot of space for optimization here) and bytecode for virtual machine (VM) is created.
When fitting, the VM calculates the value of the function and derivatives for every point.
Possible (i.e. not implemented) optimizations include Common Subexpression Elimination and JIT compilation.
There is a simple substitution mechanism that makes writing complicated functions easier. Substitutions must be assigned in the same line, after keyword where. Example:
define ReadShockley(sigma0=1, a=1) = sigma0 * t * (a - ln(t)) where t=x*pi/180
# more complicated example, with nested substitutions
define FullGBE(k, alpha) = k * alpha * eta * (eta / tanh(eta) - ln (2*sinh(eta))) where eta = 2*pi/alpha * sin(theta/2), theta=x*pi/180
Tip
Use the init file for often used definitions. See Invoking fityk for details.
Defined functions can be undefined using command undefine.
Examples:
# this is how some built-in functions could be defined
define MyGaussian(height, center, hwhm) = height*exp(-ln(2)*((x-center)/hwhm)^2)
define MyLorentzian(height, center, hwhm) = height/(1+((x-center)/hwhm)^2)
define MyCubic(a0=height,a1=0, a2=0, a3=0) = a0 + a1*x + a2*x^2 + a3*x^3
# supersonic beam arrival time distribution
define SuBeArTiDi(c, s, v0, dv) = c*(s/x)^3*exp(-(((s/x)-v0)/dv)^2)/x
# area-based Gaussian can be defined as modification of built-in Gaussian
# (it is the same as built-in GaussianA function)
define GaussianArea(area, center, hwhm) = Gaussian(area/hwhm/sqrt(pi/ln(2)), center, hwhm)
# sum of Gaussian and Lorentzian, a.k.a. PseudoVoigt (should be in one line)
define GLSum(height, center, hwhm, shape) = Gaussian(height*(1-shape), center, hwhm)
+ Lorentzian(height*shape, center, hwhm)
# split-Gaussian, the same as built-in SplitGaussian (should be in one line)
define SplitG(height, center, hwhm1=fwhm*0.5, hwhm2=fwhm*0.5) =
  x < center ? Gaussian(height, center, hwhm1)
             : Gaussian(height, center, hwhm2)
# to change definition of UDF, first undefine previous definition
undefine GaussianArea
With default settings, the value of every function is calculated at every point. Functions such as Gaussian often have non-negligible values only in a small fraction of all points. To speed up the calculation, set the option cut-function-level to a non-zero value. For each function the range with values greater than cut-function-level will be estimated, and all values outside of this range are considered to be equal to zero. Note that not all functions support this optimization.
If you have a number of loaded datasets, and the functions in different datasets do not share parameters, it is faster to fit the datasets sequentially (fit in @0; fit in @1; ...) than in parallel (fit in @*).
Each simple-variable slows down the fitting, although this is often negligible.
As already discussed, each dataset has a separate model
that can be fitted to the data.
As can be seen from the formula above,
the model is defined as a set of functions \(f_i\)
and a set of functions \(z_i\).
These sets are named F and Z respectively.
The model is constructed by specifying names of functions in these two sets.
In many cases x-correction Z is not used. The fitted curve is thus the sum of all functions in F.
Command
F += %function
adds %function to F, command
Z += %function
adds %function to Z.
To remove %function from F (or Z) either do:
F -= %function
or delete %function.
If there is more than one dataset, F and Z must be prefixed with the dataset number (e.g. @1.F += %function).
The following syntax is also valid:
# create and add function to F
%g = Gaussian(height=~66254., hwhm=~0.264, center=~24.7)
@0.F += %g
# create automatically named function and add it to F
@0.F += Gaussian(height=~66254., hwhm=~0.264, center=~24.7)
# clear F
@0.F = 0
# clear F and put three functions in it
@0.F = %a + %b + %c
# show info about the first and the last function in @0.F
info @0.F[0], @0.F[-1]
# the same as %bcp = copy(%b)
%bcp = copy(@0.F[1])
# make @1.F the exact (shallow) copy of @0.F
@1.F = @0.F
# make @1.F a deep copy of @0.F (all functions and variables
# are duplicated).
@1.F = copy(@0.F)
It is often required to keep the width or shape of peaks constant for all peaks in the dataset. To change the variables bound to parameters with a given name for all functions in F, use the command:
F.param = variable
Examples:
# Set hwhm of all functions in F that have a parameter hwhm to $foo
# (hwhm here means half-width-at-half-maximum)
F.hwhm = $foo
# Bind the variable used for the shape of peak %_1 to shapes of all
# functions in F
F.shape = %_1.shape
# Create a new simple-variable for each function in F and bind the
# variable to parameter hwhm. All hwhm parameters will be independent.
F.hwhm = ~0.2
It is possible to guess peak location and add it to F with the command:
[%name =] guess PeakType [[x1:x2]] [initial values...] [in @n]
e.g.
%f1 = guess Gaussian [22.1:30.5] in @0
# the same, but assign function's name automatically
guess Gaussian [22.1:30.5] in @0
# the same, but search for the peak in the whole dataset
guess Gaussian in @0
# the same, but works only if there is exactly one dataset loaded
guess Gaussian
guess Linear in @* # adds a function to every dataset
# guess width and height, but set center and shape explicitly
guess PseudoVoigt [22.1:30.5] center=$ctr, shape=~0.3 in @0
Fityk offers only a primitive algorithm for peak detection. It looks for the highest point in a given range, and then tries to find the width of the peak.
If the highest point is found near the boundary of the given range, it is very probable that it is not the peak top, and, if the option can-cancel-guess is set to true, the guess is cancelled.
There are two real-number options related to guess: height-correction and width-correction. The default value for them is 1. The guessed height and width are multiplied by the values of these options respectively.
Linear function is guessed using linear regression. It is actually fitted (but weights of points are not used), not guessed.
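The unweighted linear regression mentioned above has a closed-form solution, sketched here in Python. This is the textbook least-squares line, not Fityk's source code, and the function name linear_regression is hypothetical.

```python
def linear_regression(xs, ys):
    """Return (intercept, slope) of the least-squares line; weights are ignored."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared and cross deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Points lying exactly on y = 1 + 2*x.
print(linear_regression([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]))  # (1.0, 2.0)
```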
If you are using the GUI, most of the available information can be displayed with mouse clicks. Alternatively, you can use the info command. Using info+ instead of info sometimes gives more verbose output.
Below is the list of arguments of info related to this chapter. The full list is in info: show information.
The model can be exported to file as data points, using the syntax described in Exporting data, or as mathematical formula, using the info command redirected to a file:
info[+] formula in @n > filename
The style of the formula output, governed by the formula-export-style option, can be either normal (exp(-x^2)) or gnuplot (exp(-x**2)).
The list of parameters of functions can be exported using the command:
info[+] peaks in @n > filename
With @*, formulae or parameters used in all datasets are written.
This is the core. We have a set of observations (data points), to which we want to fit a model that depends on adjustable parameters. Let me quote Numerical Recipes, chapter 15.0, page 656 (if you do not know the book, visit http://www.nr.com):
The basic approach in all cases is usually the same: You choose or design a figure-of-merit function (merit function, for short) that measures the agreement between the data and the model with a particular choice of parameters. The merit function is conventionally arranged so that small values represent close agreement. The parameters of the model are then adjusted to achieve a minimum in the merit function, yielding best-fit parameters. The adjustment process is thus a problem in minimization in many dimensions. [...] however, there exist special, more efficient, methods that are specific to modeling, and we will discuss these in this chapter. There are important issues that go beyond the mere finding of best-fit parameters. Data are generally not exact. They are subject to measurement errors (called noise in the context of signal-processing). Thus, typical data never exactly fit the model that is being used, even when that model is correct. We need the means to assess whether or not the model is appropriate, that is, we need to test the goodness-of-fit against some useful statistical standard. We usually also need to know the accuracy with which parameters are determined by the data set. In other words, we need to know the likely errors of the best-fit parameters. Finally, it is not uncommon in fitting data to discover that the merit function is not unimodal, with a single minimum. In some cases, we may be interested in global rather than local questions. Not, “how good is this fit?” but rather, “how sure am I that there is not a very much better fit in some corner of parameter space?”
Our function of merit is WSSR - the weighted sum of squared residuals, also called chi-square:
![\chi^{2}(\mathbf{a})
  =\sum_{i=1}^{N} \left[\frac{y_i-y(x_i;\mathbf{a})}{\sigma_i}\right]^{2}
  =\sum_{i=1}^{N} w_{i}\left[y_{i}-y(x_{i};\mathbf{a})\right]^{2}](_images/math/a1f49d0ef1ed40eddc5de7055dc1378db1e4a95c.png)
Weights are based on standard deviations, \(w_i = 1/\sigma_i^2\).
You can learn why squares of residuals are minimized e.g. from
chapter 15.1 of Numerical Recipes.
So we are looking for a global minimum of \(\chi^2\).
This field of numerical research (looking for a minimum or maximum)
is usually called optimization; it is non-linear and global optimization.
Fityk implements three very different optimization methods.
All are well-known and described in many standard textbooks.
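The merit function can be written out in a few lines of Python. The sketch below evaluates WSSR in both forms given in the formula (residuals divided by sigma, and residuals multiplied by the weight 1/sigma^2); the function names, the sample data and the straight-line model are made up for illustration.

```python
def wssr(xs, ys, sigmas, model):
    """Sum over points of ((y_i - model(x_i)) / sigma_i)^2."""
    return sum(((y - model(x)) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))


def wssr_from_weights(xs, ys, sigmas, model):
    """The same value, written with weights w_i = 1 / sigma_i^2."""
    weights = [1.0 / s ** 2 for s in sigmas]
    return sum(w * (y - model(x)) ** 2 for x, y, w in zip(xs, ys, weights))


def line(x):
    # Hypothetical model with fixed parameters: y = 1 + 2*x
    return 1.0 + 2.0 * x


xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.5, 5.0]
sigmas = [1.0, 0.5, 2.0]

# Only the middle point deviates (by 0.5, with sigma 0.5), so WSSR is 1.
print(wssr(xs, ys, sigmas, line))
print(wssr_from_weights(xs, ys, sigmas, line))
```

A fitting method then varies the model parameters to make this number as small as possible.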
The standard deviations of the best-fit parameters are given by the square root of the corresponding diagonal elements of the covariance matrix. The covariance matrix is based on standard deviations of data points. Formulae can be found e.g. in GSL Manual, chapter Linear regression. Overview (weighted data version).
From the book J. Wolberg, Data Analysis Using the Method of Least Squares: Extracting the Most Information from Experiments, Springer, 2006, p.50:
(...) we turn to the task of determining the uncertainties associated with the $a_k$'s. The usual measures of uncertainty are standard deviation (i.e., $\sigma$) or variance (i.e., $\sigma^2$) so we seek an expression that allows us to estimate the $\sigma_{a_k}$'s. It can be shown (...) that the following expression gives us an unbiased estimate of $\sigma_{a_k}$:
$$\sigma_{a_k}^{2}=\frac{S}{n-p}C_{kk}^{-1}$$
Note that $\sigma_{a_k}$ is a square root of the value above.
In this formula n-p, the number of (active) data points minus the number
of independent parameters, is equal to the number of degrees of freedom.
S is another symbol for $\chi^2$ (the latter symbol is used e.g. in
Numerical Recipes).
Terms of the C matrix are given as (p. 47 in the same book):
$$C_{jk}=\sum_{i=1}^{n} w_i \frac{\partial f}{\partial a_j}\frac{\partial f}{\partial a_k}$$
$\sigma_{a_k}$ above is often called a standard error.
Having standard errors, it is easy to calculate confidence intervals.
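To make the standard-error recipe concrete, here is a minimal sketch for the simplest case, a straight line y = a0 + a1*x fitted by weighted least squares. It is an illustration, not Fityk's code: the function name and the data are hypothetical, and the expression used for C (a weighted sum of products of the partial derivatives of the model) is the usual least-squares form, taken here as an assumption. The errors are the square roots of S/(n-p) times the diagonal elements of the inverse of C.

```python
import math


def standard_errors_line(xs, ys, sigmas):
    """Fit y = a0 + a1*x by weighted least squares; return (params, errors)."""
    ws = [1.0 / s**2 for s in sigmas]
    # C_jk = sum_i w_i * df/da_j * df/da_k, with df/da0 = 1 and df/da1 = x
    c00 = sum(ws)
    c01 = sum(w * x for w, x in zip(ws, xs))
    c11 = sum(w * x * x for w, x in zip(ws, xs))
    # Right-hand side of the normal equations
    v0 = sum(w * y for w, y in zip(ws, ys))
    v1 = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    # Inverse of the symmetric 2x2 matrix C
    det = c00 * c11 - c01 * c01
    inv00, inv01, inv11 = c11 / det, -c01 / det, c00 / det
    a0 = inv00 * v0 + inv01 * v1
    a1 = inv01 * v0 + inv11 * v1
    # S (chi-square) at the best-fit parameters
    s = sum(w * (y - a0 - a1 * x) ** 2 for w, x, y in zip(ws, xs, ys))
    dof = len(xs) - 2  # n - p: data points minus fitted parameters
    errors = (math.sqrt(s / dof * inv00), math.sqrt(s / dof * inv11))
    return (a0, a1), errors


# Made-up data with unit standard deviations
params, errors = standard_errors_line(
    [0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 2.1, 2.9], [1.0, 1.0, 1.0, 1.0])
print(params, errors)
```

For a nonlinear model the same formulas apply with the derivatives evaluated at the best-fit parameters, which is one reason the resulting errors are only approximate.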
Now another book will be cited: H. Motulsky and A. Christopoulos,
Fitting Models to Biological Data Using Linear and Nonlinear Regression:
A Practical Guide to Curve Fitting, Oxford University Press, 2004.
This book can be downloaded for free as a manual to GraphPad Prism 4.
The standard errors reported by most nonlinear regression programs (...) are “approximate” or “asymptotic”. Accordingly, the confidence intervals computed using these errors should also be considered approximate.
It would be a mistake to assume that the “95% confidence intervals” reported by nonlinear regression have exactly a 95% chance of enclosing the true parameter values. The chance that the true value of the parameter is within the reported confidence interval may not be exactly 95%. Even so, the asymptotic confidence intervals will give you a good sense of how precisely you have determined the value of the parameter.
The calculations only work if nonlinear regression has converged on a sensible fit. If the regression converged on a false minimum, then the sum-of-squares as well as the parameter values will be wrong, so the reported standard error and confidence intervals won’t be helpful.
The book also describes more accurate ways to calculate confidence intervals, such as Monte Carlo simulations.
In Fityk, info errors shows the values of $\sigma_{a_k}$.
Note
In Fityk 0.9.0 and earlier info errors reported values of
$\sqrt{C_{kk}^{-1}}$, which makes sense if the standard
deviations of y's are set accurately. This formula is derived
in Numerical Recipes.
The Levenberg-Marquardt (L-M) method is a standard nonlinear least-squares
routine, and involves computing the first derivatives of functions.
For a description of the L-M method see Numerical Recipes, chapter 15.5
or Siegmund Brandt, Data Analysis, chapter 10.15.
Essentially, it combines an inverse-Hessian method with a steepest
descent method by introducing a $\lambda$ factor. When $\lambda$ is equal
to 0, the method is equivalent to the inverse-Hessian method.
When $\lambda$ increases, the shift vector is rotated toward the direction
of steepest descent and the length of the shift vector decreases. (The
shift vector is a vector that is added to the parameter vector.) If a
better fit is found on iteration, $\lambda$ is decreased – it is divided by
the value of lm-lambda-down-factor option (default: 10).
Otherwise, $\lambda$ is multiplied by the value of
lm-lambda-up-factor (default: 10).
The initial $\lambda$ value is equal to
lm-lambda-start (default: 0.0001).
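The way the damping factor is adjusted can be sketched for a one-parameter model. This is an illustration under stated assumptions, not Fityk's code: the model y = exp(b*x), the function name and the data are hypothetical, and the damping is applied by multiplying the approximate Hessian by (1 + lambda), as in the Numerical Recipes formulation. The keyword defaults mirror the option defaults quoted above.

```python
import math


def fit_lm(xs, ys, b, lambda_start=1e-4, up=10.0, down=10.0,
           max_lambda=1e15, max_iter=200):
    """One-parameter Levenberg-Marquardt fit of y = exp(b*x), unit weights."""
    def wssr(b_):
        return sum((y - math.exp(b_ * x)) ** 2 for x, y in zip(xs, ys))

    lam = lambda_start
    current = wssr(b)
    for _ in range(max_iter):
        # Derivatives of the model with respect to b at each data point
        derivs = [x * math.exp(b * x) for x in xs]
        g = sum((y - math.exp(b * x)) * d for x, y, d in zip(xs, ys, derivs))
        h = sum(d * d for d in derivs)  # approximate Hessian (1x1 here)
        # lam = 0 gives the inverse-Hessian step; a large lam gives a short
        # step in the direction of steepest descent.
        trial = b + g / (h * (1.0 + lam))
        trial_wssr = wssr(trial)
        if trial_wssr < current:
            b, current = trial, trial_wssr
            lam /= down   # better fit: decrease lambda
        else:
            lam *= up     # no improvement: increase lambda
        if lam > max_lambda:
            break         # WSSR no longer changes within numerical precision
    return b, current


# Noise-free made-up data generated with b = 0.5, fitted starting from b = 0
xs = [0.0, 1.0, 2.0, 3.0]
ys = [math.exp(0.5 * x) for x in xs]
b_fit, final_wssr = fit_lm(xs, ys, 0.0)
print(b_fit, final_wssr)
```

In this example the first few undamped steps overshoot and are rejected, so the damping factor grows until a shorter step improves the fit; after that it shrinks again and the iteration approaches the inverse-Hessian behaviour.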
The Marquardt method has two stopping criteria other than the common criteria.
If $\lambda$ is greater than the value of the lm-max-lambda
option (default: 10^15), usually when due to limited numerical precision
WSSR is no longer changing, the fitting is also stopped.
To quote chapter 4.8.3, p. 86 of Peter Gans, Data Fitting in the Chemical Sciences by the Method of Least Squares:
A simplex is a geometrical entity that has n+1 vertices corresponding to variations in n parameters. For two parameters the simplex is a triangle, for three parameters the simplex is a tetrahedron and so forth. The value of the objective function is calculated at each of the vertices. An iteration consists of the following process. Locate the vertex with the highest value of the objective function and replace this vertex by one lying on the line between it and the centroid of the other vertices. Four possible replacements can be considered, which I call contraction, short reflection, reflection and expansion. [...] It starts with an arbitrary simplex. Neither the shape nor position of this are critically important, except insofar as it may determine which one of a set of multiple minima will be reached. The simplex then expands and contracts as required in order to locate a valley if one exists. Then the size and shape of the simplex is adjusted so that progress may be made towards the minimum. Note particularly that if a pair of parameters are highly correlated, both will be simultaneously adjusted in about the correct proportion, as the shape of the simplex is adapted to the local contours. [...] Unfortunately it does not provide estimates of the parameter errors, etc. It is therefore to be recommended as a method for obtaining initial parameter estimates that can be used in the standard least squares method.
This method is also described in previously mentioned Numerical Recipes (chapter 10.4) and Data Analysis (chapter 10.8).
There are a few options for tuning this method. One of these is the stopping criterion nm-convergence. If the value of the expression 2(M-m)/(M+m), where M and m are the values of the worst and best vertices respectively (values of the objective function at the vertices, to be precise!), is smaller than the value of the nm-convergence option, fitting is stopped. In other words, fitting is stopped if all vertices are almost at the same level.
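The stopping test can be written in a few lines. This is a sketch of the expression given above, not Fityk's code; the function name and the threshold used in the example are hypothetical, and the guard for a zero denominator is an added assumption.

```python
def nm_converged(vertex_values, nm_convergence):
    """Return True if 2(M-m)/(M+m) is below the nm-convergence threshold."""
    worst = max(vertex_values)  # M: objective function at the worst vertex
    best = min(vertex_values)   # m: objective function at the best vertex
    if worst + best == 0.0:
        # Assumption: all values are zero (perfect fit), treat as converged
        return True
    return 2.0 * (worst - best) / (worst + best) < nm_convergence


# Vertices at almost the same level, then vertices that still differ a lot
print(nm_converged([10.0, 10.0005, 10.001], 1e-3))
print(nm_converged([1.0, 2.0, 3.0], 1e-3))
```

Because the difference is divided by the sum, the criterion is relative: it does not depend on the overall scale of WSSR.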
The remaining options are related to initialization of the simplex. Before starting iterations, we have to choose a set of points in space of the parameters, called vertices. Unless the option nm-move-all is set, one of these points will be the current point – values that parameters have at this moment. All but this one are drawn as follows: each parameter of each vertex is drawn separately. It is drawn from a distribution that has its center in the center of the domain of the parameter, and a width proportional to both the width of the domain and the value of the nm-move-factor parameter. Distribution shape can be set using the option nm-distribution as one of: uniform, gaussian, lorentzian and bound. The last one causes the value of the parameter to be either the greatest or smallest value in the domain of the parameter – one of the two bounds of the domain (assuming that nm-move-factor is equal to 1).
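The initialization described above can be sketched as follows. Several details are assumptions made for illustration and may differ from Fityk: the domain of each parameter is given as a (low, high) pair, the proportionality constant for the width is taken as 1, only the uniform and bound distributions are covered, and the function and argument names are hypothetical.

```python
import random


def init_simplex(current, domains, move_factor=1.0,
                 distribution="uniform", move_all=False, seed=0):
    """Build the n+1 starting vertices for n parameters."""
    rng = random.Random(seed)
    n = len(current)

    def draw_vertex():
        vertex = []
        for low, high in domains:
            # Each parameter is drawn separately around the domain center
            center = 0.5 * (low + high)
            half_width = 0.5 * (high - low) * move_factor
            if distribution == "uniform":
                value = rng.uniform(center - half_width, center + half_width)
            elif distribution == "bound":
                # One of the two bounds of the domain when move_factor is 1
                value = center + rng.choice((-1.0, 1.0)) * half_width
            else:
                raise ValueError("only 'uniform' and 'bound' are sketched here")
            vertex.append(value)
        return vertex

    if move_all:
        return [draw_vertex() for _ in range(n + 1)]
    # Otherwise the current parameter values form one of the vertices
    return [list(current)] + [draw_vertex() for _ in range(n)]


simplex = init_simplex([1.0, 5.0], [(0.0, 2.0), (4.0, 8.0)])
for vertex in simplex:
    print(vertex)
```

With two parameters this yields a triangle: the current point plus two randomly drawn vertices.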
[TODO]
Note
This chapter is not about GUI settings (things like colors, fonts, etc.), but about settings that are common to both the CLI and GUI versions.
Command info set shows the syntax of the set command and lists all possible options.
set option shows the current value of the option.
set option = value changes the option.
It is possible to change the value of the option temporarily using syntax:
with option1=value1 [,option2=value2]  command args...
The examples at the end of this chapter should clarify this.
Examples:
set fitting-method  # show info
set fitting-method = Nelder-Mead-simplex # change default method
set verbosity = verbose
with fitting-method = Levenberg-Marquardt fit 10
with fitting-method=Levenberg-Marquardt, verbosity=only-warnings fit 10
In the GUI version there is hardly ever a need to use this command directly.
The command plot controls visualization of data and the model. It is used to plot a given area - in GUI it is plotted in the program’s main window, in CLI the popular program gnuplot is used, if available.
plot xrange yrange in @n
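xrange and yrange have one of two following syntaxes:
The first is [min:max], where either bound can be omitted; an omitted bound is taken from the data.
The second is just a dot (.), and it implies that the appropriate range is not to be changed.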
Examples:
plot [20.4:50] [10:20] # show x from 20.4 to 50 and y from 10 to 20
plot [20.4:] # x from 20.4 to the end,
# y range will be adjusted to encompass all data
plot . [:10] # x range will not be changed, y from the lowest point to 10
plot [:] [:] # all data will be shown
plot         # all data will be shown
plot . .     # nothing changes
The value of the option autoplot changes the automatic plotting behaviour. By default, the plot is refreshed automatically after changing the data or the model. It is also possible to visualize each iteration of the fitting method by replotting the peaks after every iteration.
First, there is an option verbosity (not related to command info) which sets the amount of messages displayed when executing commands.
If you are using the GUI, most information can be displayed with mouse clicks. Alternatively, you can use the info command. Using info+ instead of info sometimes displays more detailed information.
The output of info can be redirected to a file using syntax:
info args > filename    # this truncates the file
info args >> filename   # this appends to the file
The following info arguments are recognized:
info der shows derivatives of given function:
=-> info der sin(a) + 3*exp(b/a)
f(a, b) = sin(a)+3*exp(b/a)
df / d a = cos(a)-3*exp(b/a)*b/a^2
df / d b = 3*exp(b/a)/a
All commands given during program execution are stored in memory. They can be listed by:
info commands [n:m]
or written to file:
info commands [n:m] > filename
To put all commands executed so far during the session into the file foo.fit, type:
info commands[:] > foo.fit
With the plus sign (+) (i.e. info+ commands [n:m]) information about the exit status of each command will be added.
Commands can be logged to a file as they are executed:
commands > filename    # log commands
commands+ > filename   # log both commands and output
commands > /dev/null   # stop logging
Scripts can be executed using the command:
commands < filename
You can select lines that are to be executed:
commands < filename[m:n] # this executes lines from m to n
It is also possible to execute standard output from an external program:
commands ! program [args...]
The command:
dump > filename
writes the current state of the program (including all datasets) to a single .fit file.
The command sleep sec makes the program wait sec seconds before continuing.
The command quit works as expected. If it is found in a script it quits the program, not only the script.
Commands that start with ! are passed (without ‘!’) to the system() call.