Documentation for hightea-plotting

A supporting collection of classes and functions for the HighTEA project, aimed at facilitating data visualisation in high-energy physics precision studies.

Plotting procedures

hightea.plotting.plotting.plot(*runs, **kwargs)

General plotting routine

Parameters:
  • *runs – Objects that are instances of Run class, or can be cast into one (filepaths as strings, dictionaries), passed as positional arguments.

  • **kwargs

    Optional keyword arguments

    • show : bool, default: True

      Show the plot once it is done.

    • ratio : int

      Add a ratio plot relative to the run with the given id.

    • figsize : tuple

      Figure size, specified in the same way as in matplotlib.

    • logscale : bool

      Plot on a logarithmic scale over the axes of interest. Possible values: 'x', 'y', 'xy'.

    • lim : dict

      Limits on axes, e.g. {'y2': [0.8, 1.2]}.

    • latex : bool, default: False

      Use LaTeX fonts on plots (warning: slow).

    • title : str

      Plot title.

    • show_setup : bool

      Show information on the central setup.

    • output : str

      Save to a png/pdf file if specified.

    • figure : matplotlib.pyplot.Figure

      If passed, use as the canvas; otherwise create a new figure.

Examples

>>> # Import library
>>> import hightea.plotting as ht
>>> # Plot data stored in files
>>> ht.plot('HEPDATA-experiment.csv', 'request.json')
>>> # Construct and plot random runs
>>> run = ht.Run.random([10], nsetups=3)
>>> run2 = run.minicopy()
>>> run2 += 1
>>> ht.plot(run, run2, ratio=0, figsize=(16,9), title='Random runs')
>>> # Example of plotting on top of the data
>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> fig = plt.figure()
>>> runs = ...
>>> ht.plot(*runs, figure=fig, show=False)
>>> ax = fig.get_axes()[0]
>>> X = np.array([(l+r)/2 for l,r in zip(run.edges[0][:-1], run.edges[0][1:])])
>>> Y = (X**2+2*80**2)**-2
>>> ax.scatter(X, Y, label='fit')
>>> ax.legend()
>>> plt.show()

Note

See the supporting procedures for how other keyword arguments are used.

Returns:

Figure used in plotting, returned to allow further manipulation.

Return type:

matplotlib.pyplot.Figure

hightea.plotting.plotting.plot_unrolled(ax, *runs, **kwargs)

Procedure to plot provided runs as 1D unrolled histograms

Parameters:
  • *runs – Runs passed as positional arguments.

  • **kwargs

    • grid : bool, default: True

      If True, show grid on plot.

    • colorscheme : list

      List of colors to use for provided runs.

    • legend : bool

      If True, show legend.

    • finetune : dict

      Dictionary mapping plotting functions (keys) to parameters (values) that are passed directly to the underlying local plotting routines. Affects: Axes.grid.

Examples

>>> # Example of a more involved plot using underlying routines
>>> import hightea.plotting as ht
>>> import matplotlib.pyplot as plt
>>> run = ht.Run.random([10], nsetups=3)
>>> runs = dict(lo=run, data=run+0.5, nnlo=(run+0.6).update_info('NNLO absolute'))
>>> fig = plt.figure()
>>> ax = fig.add_subplot(2,1,1)
>>> ht.plotting.plot_unrolled(ax, runs['nnlo'])
>>> ax = fig.add_subplot(2,1,2)
>>> ht.plotting.plot_unrolled(ax, *[
...         (runs['lo']/runs['data'].v())[0].update_info('LO (central scale)'),
...         (runs['data']/runs['data'].v()).update_info(name='NLO', experiment=True),
...         (runs['nnlo']/runs['data'].v()).update_info('NNLO'),
...     ],
...     finetune=dict(grid={"alpha":.3}),
... )
>>> ax.set_ylabel('Ratio to data')
>>> plt.show()

Note

See the supporting procedures for how other keyword arguments are used.

Return type:

None

Run class

class hightea.plotting.run.Run(file=None, bins=None, edges=None, nhist=0, nsetups=1, **kwargs)

Container for observable results and metadata

Designed to be a universal container for histogram and differential distribution results from hightea, HEPDATA, and other sources. Stores values and statistical errors as simple two-dimensional numpy arrays, whose dimensions correspond to the list of bins and the calculation setups.

values

Values for histogram or differential distribution across all bins and scale setups. Has (X,Y) shape, where X is the number of bins, and Y is the number of scale setups.

Type:

numpy.ndarray

errors

Statistical errors corresponding to values. Same shape as values.

Type:

numpy.ndarray

__init__(file=None, bins=None, edges=None, nhist=0, nsetups=1, **kwargs)

Run class constructor.

Creates and returns an instance of the Run class based on the provided information. To parse a file or a constructed object, use the file parameter; otherwise, specify either bins or edges.

Parameters:
  • file (optional) – Path to file or an instance of an object. Several formats supported: JSON, CSV, dictionary.

  • bins (optional) – Bins in the format of 3D-list: [… [ [l1,r1], [l2,r2], …], …]

  • edges (optional) – Bin edges in the format of 2D-list: [ … , [x0, x1 … ], … ]

Returns:

Instance of Run class.

Return type:

Run

Examples

>>> run = Run('req.json')
>>> run = Run('experiment.csv')
>>> run = Run(dict(...))
>>> run = Run(edges=[[0,1,2,3,4,5],[-1,-0.5,0,0.5,1]])

v()

Get values at central scale

Returns:

Array of central scale values for each bin correspondingly.

Return type:

numpy.ndarray

e()

Get errors at central scale

Returns:

Array of central scale errors for each bin correspondingly.

Return type:

numpy.ndarray

upper()

Get upper values for scale variation

Returns:

Array of upper scale band values for each bin correspondingly.

Return type:

numpy.ndarray

lower()

Get lower values for scale variation

Returns:

Array of lower scale band values for each bin correspondingly.

Return type:

numpy.ndarray
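
The relation between v(), upper() and lower() can be pictured with plain lists. The sketch below is not the library's implementation: it assumes that the first setup (column 0) is the central one and that the band is the per-bin envelope over all setups. The helper names are hypothetical.

```python
# Illustrative only: a (bins x setups) table as nested lists.
# Assumption: column 0 holds the central scale setup.
values = [
    [1.0, 1.2, 0.9],  # bin 0: central, setup 1, setup 2
    [2.0, 1.8, 2.3],  # bin 1
]

def central(table):
    """Central-setup value for each bin (assumed to be column 0)."""
    return [row[0] for row in table]

def upper_band(table):
    """Largest value over all setups, bin by bin."""
    return [max(row) for row in table]

def lower_band(table):
    """Smallest value over all setups, bin by bin."""
    return [min(row) for row in table]

print(central(values), upper_band(values), lower_band(values))
```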

dim()

Get dimension of the run

Returns:

Number of dimensions in data.

Return type:

int

dimensions()

Get dimensions for each axis

Returns:

List of bin numbers for each dimension.

Return type:

list
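
For a run defined by edges, both dim() and dimensions() follow from the edge lists alone. A minimal sketch using the edges from the constructor example; the helper names are hypothetical and the code is not taken from the library's source.

```python
# Edges in the 2D-list format: one list of bin edges per dimension.
edges = [[0, 1, 2, 3, 4, 5], [-1, -0.5, 0, 0.5, 1]]

def n_dimensions(edges):
    """Number of dimensions: one per list of edges."""
    return len(edges)

def bins_per_dimension(edges):
    """N edges along an axis delimit N - 1 bins."""
    return [len(axis) - 1 for axis in edges]

print(n_dimensions(edges), bins_per_dimension(edges))
```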

nsetups()

Get number of setups in run

Returns:

Number of different scale setups in the run.

Return type:

int

label(name)

Provide name for the current run

Parameters:

name (string) – New run name

Return type:

Self

update_info(info=None, **kwargs)

Update run information

Parameters:
  • info (optional) –

    • Run: copy metadata from passed run

    • str: set the name of the run

    • dict: update metadata with dictionary

  • **kwargs – Arbitrary information will be passed on to run.info dictionary.

Return type:

Self

property info

Retrieve run metadata

Return type:

dict

property name

Retrieve run name

Return type:

str

property bins

Retrieve run bins

Return type:

list

property edges

Retrieve run edges

Return type:

list

static bin_area(bins)

Calculate area for a multidimensional histogram bin.

Return type:

float
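
The area of a multidimensional bin is conventionally the product of its widths along each dimension. The sketch below works under that assumption with a hypothetical helper name; the library's bin_area may differ in details, for example in how it treats overflow bins.

```python
import math

def area(bin_):
    """Product of (right - left) over dimensions for a bin [[l1, r1], [l2, r2], ...]."""
    return math.prod(right - left for left, right in bin_)

print(area([[0, 2], [-1, 0.5]]))
```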

loading_methods()

Convenient loading methods to load data into Run class

load(request, nhist=0, **kwargs)

Load data to Run.

Parameters:
  • request (dict) – Python dictionary in the JSON format as returned by hightea server.

  • **kwargs – Any modifications to the metadata dictionary.

Return type:

None

is_differential()

Check if run set to be a differential distribution

Looks into self.info['differential'] to see whether the run is set to be a histogram or a differential distribution. By default, runs are treated as histograms.

Return type:

bool

make_histogramlike(ignorechecks=False)

Turn differential distribution to histogram

Checks whether the data is already set to be a histogram and, if not, performs normalisation over bin area.

Returns:

Modifies and returns self.

Return type:

Run

make_differential()

Turn histograms into differential distributions

Checks whether the data is already set to be a differential distribution and, if not, performs normalisation over bin area.

Returns:

Modifies and returns self.

Return type:

Run
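
The usual relation between the two representations is that a differential distribution is the histogram content divided by the bin area, and the reverse conversion multiplies by it. The sketch below illustrates this in one dimension. The function names are hypothetical, and the code does not reproduce the library's checks or metadata handling.

```python
def to_differential(contents, edges):
    """Divide each bin content by its width."""
    widths = [r - l for l, r in zip(edges[:-1], edges[1:])]
    return [c / w for c, w in zip(contents, widths)]

def to_histogram(densities, edges):
    """Multiply each density by its bin width."""
    widths = [r - l for l, r in zip(edges[:-1], edges[1:])]
    return [d * w for d, w in zip(densities, widths)]

edges = [0.0, 1.0, 3.0, 7.0]
contents = [2.0, 4.0, 8.0]
densities = to_differential(contents, edges)
print(densities)
print(to_histogram(densities, edges))
```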

abs()

Return run with absolute values

Creates a deep copy of self and modifies run name.

Return type:

Run

has_OUF()

Check if contains over- and underflow bins.

Return type:

bool

remove_OUF(inplace=False)

Remove over- and underflow bins

Returns a run without over- and underflow bins.

Parameters:

inplace (optional) – If True, modify in-place and return self.

Return type:

Run

zoom(value=None, line=None, dim=0)

Get run with lower dimensional slice of the data.

Specify the bin by some value that it contains or directly by the line number. The metadata is passed on as is, with modified observable name.

Parameters:
  • value (optional) – The slice will contain provided value.

  • line (optional) – The slice will be taken at this bin number.

  • dim (optional) – Dimension at which to take the slice.

Return type:

Run
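
Selecting a slice by value amounts to finding the bin along the chosen dimension that contains that value. A sketch of such a lookup with the standard bisect module, assuming bins that are closed on the left and open on the right; the helper name is hypothetical and the library may resolve boundaries differently.

```python
from bisect import bisect_right

def bin_index(edges, value):
    """Index i of the bin with edges[i] <= value < edges[i + 1]."""
    if not edges[0] <= value < edges[-1]:
        raise ValueError("value lies outside the binned range")
    return bisect_right(edges, value) - 1

edges = [0, 1, 2, 3, 4, 5]
print(bin_index(edges, 2.5))
```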

transpose()

mergebins(values=None, pos=None)

Merge bins by values or positions

Specify the values or positions for bins to be merged into one. The metadata is passed on as is. Only 1-dim runs are supported.

Parameters:
  • values (optional) – List with 2 values [l,r] is expected. Bins [a,b] which satisfy l <= a, b < r will be merged into one bin.

  • pos (optional) – List with 2 values [l,r] is expected. Bins with id: l <= id < r (inclusively) will be merged into one bin.

Return type:

Run
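
Merging by position can be sketched for a 1D histogram as follows. The assumptions are stated loudly because the source does not confirm them: the range is taken as half-open (l <= id < r), contents are histogram-like and therefore add, and statistical errors are treated as uncorrelated and combined in quadrature. The function name is hypothetical.

```python
import math

def merge(edges, contents, errors, l, r):
    """Merge bins l..r-1 of a 1D histogram into a single bin."""
    new_edges = edges[:l + 1] + edges[r:]
    new_contents = contents[:l] + [sum(contents[l:r])] + contents[r:]
    # Assumption: uncorrelated errors, added in quadrature.
    merged_error = math.sqrt(sum(e * e for e in errors[l:r]))
    new_errors = errors[:l] + [merged_error] + errors[r:]
    return new_edges, new_contents, new_errors

edges = [0, 1, 2, 3, 4]
contents = [1.0, 2.0, 3.0, 4.0]
errors = [0.3, 0.4, 0.0, 0.1]
merged_edges, merged_contents, merged_errors = merge(edges, contents, errors, 0, 2)
print(merged_edges, merged_contents, merged_errors)
```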

minicopy(copyinfo=False)

Minimal copy of run

Copies only the data.

Parameters:

copyinfo (optional) – If True, will include metadata.

Return type:

Run

deepcopy()

Full (deep) copy of run

Return type:

Run

flatten()

Remove dimensions represented by single bins

Return type:

Run
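
In terms of edges, a dimension represented by a single bin is one with exactly two edges. A sketch of the selection step only, with a hypothetical helper name; it does not touch values or metadata, which the library presumably handles as well.

```python
def drop_single_bin_axes(edges):
    """Keep only axes with more than one bin (more than two edges)."""
    return [axis for axis in edges if len(axis) > 2]

print(drop_single_bin_axes([[0, 1, 2, 3], [0, 100], [-1, 0, 1]]))
```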

to_htdict()

Get dictionary in hightea format from run

Returns:

Dictionary in hightea format.

Return type:

dict

to_json(file, combined=False, verbose=True)

Dump run to JSON file in hightea format

Parameters:
  • file (str) – Output file.

  • combined (bool, default: False) – If true, will print all setups into one file. Otherwise, will print each setup separately into different files.

  • verbose (bool, default: True) – Verbosity switch.

Return type:

None

to_csv(file, header=None, **kwargs)

Dump run to CSV file in HEPDATA format

Parameters:
  • file (str) – Output file.

  • **kwargs

    • header: specify header for csv file

    • all_values: print values across all setups, not only the central value and the upper and lower band values

    • logx: if true, bin centers are geometric mean, otherwise simple average

Return type:

None
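
The two bin-center conventions selected by logx can be illustrated as follows. This is a sketch of the arithmetic only, with a hypothetical helper name, not the library's CSV writer.

```python
import math

def bin_centers(edges, logx=False):
    """Arithmetic mean of the bin edges, or geometric mean when logx is True."""
    pairs = zip(edges[:-1], edges[1:])
    if logx:
        return [math.sqrt(l * r) for l, r in pairs]
    return [(l + r) / 2 for l, r in pairs]

edges = [1.0, 4.0, 16.0]
print(bin_centers(edges))             # [2.5, 10.0]
print(bin_centers(edges, logx=True))  # [2.0, 8.0]
```

The geometric mean places the center in the middle of the bin on a logarithmic axis, which is why it is tied to a log-scaled x axis.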

static convert_to_edges(binsList)

Get edges for each dimension given a list of bins

Parameters:

binsList (list) – Bins in the format of 3D-list: [… [ [l1,r1], [l2,r2], …], …]

Returns:

2-dimensional list.

Return type:

list

static convert_to_bins(edgesList)

Get full list of bins given edges for each dimension

Parameters:

edgesList (list) – Bin edges in the format of 2D-list: [ … , [x0, x1 … ], … ]

Returns:

3-dimensional list.

Return type:

list
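
The two formats carry the same information, as a round trip shows. The sketch below uses hypothetical function names and assumes that bins are enumerated with the last dimension varying fastest; the ordering used by convert_to_bins is not stated in this documentation.

```python
from itertools import product

def edges_to_bins(edges_list):
    """All bins as [[l1, r1], [l2, r2], ...], last dimension varying fastest."""
    per_axis = [
        [[l, r] for l, r in zip(axis[:-1], axis[1:])]
        for axis in edges_list
    ]
    return [list(combo) for combo in product(*per_axis)]

def bins_to_edges(bins_list):
    """Recover the sorted unique edges along each dimension."""
    ndim = len(bins_list[0])
    return [
        sorted({x for b in bins_list for x in b[d]})
        for d in range(ndim)
    ]

edges = [[0, 1, 2], [-1, 0, 1]]
bins = edges_to_bins(edges)
print(bins[0], len(bins))
print(bins_to_edges(bins) == edges)
```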

static full(dims, nsetups=1, fill_value=0)

Get run filled with constant values

Parameters:
  • nsetups (int, default: 1) – Number of scale setups.

  • fill_value (float, default: 0) – Constant value to fill into histograms.

Return type:

Run

static seq(dims, nsetups=1)

Get multidimensional run for testing

Values are taken from the natural sequence.

Parameters:

nsetups (int, default: 1) – Number of scale setups.

Return type:

Run

static random(dims, nsetups=1, seed=None)

Get multidimensional run for testing

Values are generated randomly.

Parameters:
  • nsetups (optional) – Number of scale setups.

  • seed (optional) – Set seed in numpy.random.

Return type:

Run

Extra loaders

hightea.plotting.stripper.convert_to_Run(mt: MeasurementTools, file=0, **kwargs)

Convert instance of MeasurementTools to Run

Parameters:
  • mt (MeasurementTools) –

  • file (int) – Specify file id.

  • **kwargs

    • verbose : bool, default: 0

    • obs : int, default: 0

    • hist : int, default: 0

    • setups : list, default: <all>

    • withOUF : list, default: False

Return type:

Run

hightea.plotting.stripper.load_to_Run(xmlfile, **kwargs)

Load from stripper xml file to Run

Parameters:

xmlfile (str) – XML file produced by Stripper.

Note

All keyword arguments are passed to convert_to_Run.

Return type:

Run
