esmraldi.spectraprocessing

Module for the preprocessing of spectra specifically designed for MALDI images

  • Peak picking

  • Local realignment procedures

  • Deisotoping

Module Contents

Functions

spectra_sum(spectra)

Computes the spectrum from the sum of all spectra.

spectra_mean(spectra)

Computes the average spectrum.

spectra_mean_centroided(spectra[, mzs])

spectra_max(spectra)

Computes the maximum intensity for each abscissa.

spectra_min(spectra)

Computes the minimum intensity for each abscissa.

spectra_peak_indices(spectra[, prominence, wlen])

Estimates and extracts significant peaks in the spectra

spectra_peak_indices_adaptative(spectra[, factor, wlen])

Estimates and extracts significant peaks in the spectra

spectra_peak_mzs_adaptative(spectra[, factor, wlen])

Estimates and extracts significant peaks in the spectra

spectra_peak_indices_adaptative_noiselevel(spectra[, ...])

Estimates and extracts significant peaks in the spectra

spectra_peak_mzs_adaptative_noiselevel(spectra[, ...])

Estimates and extracts significant peaks in the spectra

peak_indices(data, prominence, wlen[, distance])

Estimates and extracts significant peaks

peak_indices_cwt(data, factor, widths)

Peak indices using continuous wavelet transform.

spectra_peak_mzs_cwt(spectra, factor, widths)

Peak detection using the Continuous Wavelet Transform.

same_mz_axis(spectra[, tol])

Generates spectra with common m/z values.

tic_values(spectra)

normalization_tic(spectra[, inplace])

TIC (total ion count) normalization.

normalization_sic(spectra, indices_peaks[, width_peak])

SIC (selective ion count) normalization.

index_groups(indices[, step, is_ppm])

Makes groups of indices.

index_groups_start_end(indices[, step, is_ppm])

peak_reference_indices_group(group)

Extracts the reference peak in a group, i.e. the most frequent in a group.

peak_reference_indices_groups(groups)

Extracts the reference peaks for several groups.

peak_reference_indices_median(groups)

Extracts the reference peak in a group as the median peak.

width_peak_mzs(aligned_mzs, groups[, default])

Computes the width of a peak

width_peak_indices(indices, full_indices)

Computes the width of a peak

closest_peak(num, indices_to_width)

Extracts the closest peak of index num.

min_step(mzs, max_len[, starting_step, incr_step])

realign_reducing(out_spectra, spectra[, step, is_ppm])

realign_mean_spectrum(mzs, intensities, all_mzs[, ...])

realign_tree(spectra, mzs, mean_spectra[, step, is_ppm])

realign_wrt_peaks_mzs_generic(spectra, aligned_mzs)

realign_wrt_peaks_mzs(spectra, aligned_mzs, full_mzs, ...)

Realign spectra to reference peaks.

realign_wrt_peaks(spectra, aligned_peaks, full_peaks, ...)

Realign spectra to reference peaks

realign_indices(spectra, indices[, reference, ...])

Alignment function.

realign_mzs(spectra, mzs[, reference, nb_occurrence, ...])

Alignment function.

realign_generic(spectra, peaks[, step, is_ppm])

neighbours(index, n, spectra)

Right-sided neighbours of a point in a spectrum.

forward_derivatives(peaks)

Forward derivatives from peak value distribution.

find_isotopic_pattern(neighbours, tolerance, nb_charges)

Extracts isotopic pattern based on mz similarity

find_isotopic_pattern_theoretical_difference(...[, is_ppm])

peaks_max_intensity_isotopic_pattern(pattern)

Finds the peak with maximum intensity in the isotopic pattern.

peaks_derivative_isotopic_pattern(pattern)

Finds the peak where the sign of the derivative changes

isotopes_from_pattern(pattern, peaks_in_pattern)

Find isotopes from a pattern.

mz_second_isotope_most_abundant(average_distribution)

Determines where the second isotope becomes the most abundant

peak_to_index(peak, pattern)

Gets the index of a peak in a pattern

deisotoping_simple(spectra[, tolerance, ...])

Simple deisotoping depending on the mass of the second most abundant isotope.

deisotoping_simple_reference(spectra[, th_diff, ...])

Simple deisotoping depending on the mass of the second most abundant isotope.

deisotoping_reference_indices(peaks[, th_diff, ...])

subtract_spectra(target, source)

extract_mean_spectra_coordinates(spectra, coordinates, ...)

esmraldi.spectraprocessing.spectra_sum(spectra)

Computes the spectrum from the sum of all spectra.

Parameters
spectra: np.ndarray

Spectra as [mz*I] array

Returns
np.ndarray

Sum of spectra

esmraldi.spectraprocessing.spectra_mean(spectra)

Computes the average spectrum.

Parameters
spectra: np.ndarray

Spectra as [mz*I] array

Returns
np.ndarray

Mean spectrum
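The sum and mean spectra can be sketched as reductions over the intensity rows. This is a minimal illustration, assuming the spectra are stored as a NumPy array of shape (n_spectra, 2, n_points) with m/z values at index 0 and intensities at index 1; that layout is an assumption about the [mz*I] notation, and the helper names `sum_spectrum` and `mean_spectrum` are hypothetical stand-ins rather than the module's own code.

```python
import numpy as np

def sum_spectrum(spectra):
    # Assumed layout: spectra[i, 0] holds m/z values, spectra[i, 1] intensities.
    return spectra[:, 1, :].sum(axis=0)

def mean_spectrum(spectra):
    # Average intensity at each abscissa across all spectra.
    return spectra[:, 1, :].mean(axis=0)

mzs = np.array([100.0, 101.0, 102.0])
spectra = np.array([
    [mzs, [1.0, 2.0, 3.0]],
    [mzs, [3.0, 2.0, 1.0]],
])
total = sum_spectrum(spectra)
average = mean_spectrum(spectra)
```

The same pattern with `max(axis=0)` or `min(axis=0)` would give the per-abscissa maximum and minimum under the same layout assumption.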

esmraldi.spectraprocessing.spectra_mean_centroided(spectra, mzs=None)

esmraldi.spectraprocessing.spectra_max(spectra)

Computes the maximum intensity for each abscissa.

Parameters
spectra: np.ndarray

Spectra as [mz*I] array

Returns
np.ndarray

Max spectrum

esmraldi.spectraprocessing.spectra_min(spectra)

Computes the minimum intensity for each abscissa.

Parameters
spectra: np.ndarray

Spectra as [mz*I] array

Returns
np.ndarray

Min spectrum

esmraldi.spectraprocessing.spectra_peak_indices(spectra, prominence=50, wlen=10)

Estimates and extracts significant peaks in the spectra by using the prominence (height of the peak relative to the nearest higher peak).

Parameters
spectra: np.ndarray

Spectra as [mz*I] array

prominence: int

threshold on prominence

Returns
np.ndarray

Peak indices relative to spectra

esmraldi.spectraprocessing.spectra_peak_indices_adaptative(spectra, factor=1, wlen=10)

Estimates and extracts significant peaks in the spectra by using the local prominence (height of the peak relative to the background noise).

Background noise is estimated as the standard deviation of the signal over a window of size wlen.

Parameters
spectra: np.ndarray

Spectra as [mz*I] array

factor: float

prominence factor

wlen: int

size of the window

Returns
np.ndarray

Peak indices relative to spectra
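The idea of a noise-relative threshold can be sketched as follows. This is an illustrative approximation, not the module's implementation: it treats a point as a peak when it is a local maximum whose height above the lowest value in a surrounding window exceeds `factor` times the standard deviation of that window. The function name `adaptive_peak_indices` and the choice of a centred window are assumptions.

```python
import numpy as np

def adaptive_peak_indices(intensities, factor=1.0, wlen=10):
    y = np.asarray(intensities, dtype=float)
    half = wlen // 2
    peaks = []
    for i in range(1, len(y) - 1):
        # Keep only local maxima.
        if not (y[i] > y[i - 1] and y[i] >= y[i + 1]):
            continue
        lo, hi = max(0, i - half), min(len(y), i + half + 1)
        window = y[lo:hi]
        noise = window.std()                     # local noise estimate
        local_prominence = y[i] - window.min()   # height above the window floor
        if local_prominence > factor * noise:
            peaks.append(i)
    return np.array(peaks, dtype=int)

signal = [1, 2, 1, 2, 1, 50, 1, 2, 1, 2, 1]
found = adaptive_peak_indices(signal, factor=2, wlen=4)
```

With this toy signal, the small ripples fail the threshold and only the large peak at index 5 is kept.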

esmraldi.spectraprocessing.spectra_peak_mzs_adaptative(spectra, factor=1, wlen=10)

Estimates and extracts significant peaks in the spectra by using the local prominence (height of the peak relative to the background noise).

Background noise is estimated as the standard deviation of the signal over a window of size wlen.

Parameters
spectra: np.ndarray

Spectra as [mz*I] array

factor: float

prominence factor

wlen: int

size of the window

Returns
np.ndarray

Peaks m/z

esmraldi.spectraprocessing.spectra_peak_indices_adaptative_noiselevel(spectra, factor=1, noise_level=1, wlen=10)

Estimates and extracts significant peaks in the spectra with specified noise level(s), by using the local prominence (height of the peak relative to the background noise)

Parameters
spectra: np.ndarray

Spectra as [mz*I] array

factor: float

prominence factor

noise_level: float or list

noise level

wlen: int

size of the window

Returns
np.ndarray

Peak indices relative to spectra

esmraldi.spectraprocessing.spectra_peak_mzs_adaptative_noiselevel(spectra, factor=1, noise_level=1, wlen=10)

Estimates and extracts significant peaks in the spectra with specified noise level(s), by using the local prominence (height of the peak relative to the background noise)

Parameters
spectra: np.ndarray

Spectra as [mz*I] array

factor: float

prominence factor

noise_level: float or list

noise level

wlen: int

size of the window

Returns
np.ndarray

Peaks m/z

esmraldi.spectraprocessing.peak_indices(data, prominence, wlen, distance=1)

Estimates and extracts significant peaks in the spectrum, by using the prominence (height of the peak relative to the nearest higher peak).

Parameters
data: np.ndarray

Spectra as [mz*I] array

prominence: int

threshold on prominence

Returns
np.ndarray

Peak indices relative to spectrum

esmraldi.spectraprocessing.peak_indices_cwt(data, factor, widths)

Peak indices using continuous wavelet transform.

Parameters
data: np.ndarray

Spectra as [mz*I] array

factor: float

Threshold SNR

widths: list

scales

Returns
np.ndarray

Peak indices relative to spectrum

esmraldi.spectraprocessing.spectra_peak_mzs_cwt(spectra, factor, widths)

Peak detection using the Continuous Wavelet Transform.

Parameters
spectra: np.ndarray

Spectra as [mz*I] array

factor: float

CWT threshold

widths: list

Wavelet widths

Returns
np.ndarray

Detected peak m/z ratios

esmraldi.spectraprocessing.same_mz_axis(spectra, tol=0)

Generates spectra with common m/z values.

Missing intensity values are added as np.nan.

Parameters
spectra: np.ndarray

Spectra as [mz*I] array

tol: float

Tolerance within which two species are considered the same

Returns
np.ndarray

Spectra as [mz*I] array
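A minimal sketch of putting spectra on a shared m/z axis, for the exact-match case (tol=0) only. It assumes each spectrum is a pair of (m/z values, intensities) with unique m/z values; the helper name `common_axis` and this input layout are assumptions made for illustration.

```python
import numpy as np

def common_axis(spectra):
    # Union of all m/z values, sorted and de-duplicated.
    all_mzs = np.unique(np.concatenate([mzs for mzs, _ in spectra]))
    # Start from NaN so that missing intensities stay marked as missing.
    out = np.full((len(spectra), all_mzs.size), np.nan)
    for row, (mzs, intensities) in enumerate(spectra):
        idx = np.searchsorted(all_mzs, mzs)  # position of each m/z on the shared axis
        out[row, idx] = intensities
    return all_mzs, out

spectra = [
    (np.array([100.0, 101.0]), np.array([5.0, 7.0])),
    (np.array([101.0, 102.0]), np.array([3.0, 9.0])),
]
axis, intensities = common_axis(spectra)
```

A non-zero tolerance would additionally require merging m/z values that lie within `tol` of each other before building the axis, which this sketch does not attempt.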

esmraldi.spectraprocessing.tic_values(spectra)

esmraldi.spectraprocessing.normalization_tic(spectra, inplace=False)

TIC (total ion count) normalization.

Divides each intensity in a spectrum by the sum of all its intensities.

Parameters
spectra: np.ndarray

spectra as [mz*I] array

Returns
np.ndarray

normalized spectrum
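The division described above can be sketched on a plain intensity matrix with one row per spectrum. The helper name `tic_normalize` and the handling of all-zero spectra (left as zeros) are assumptions of this sketch, not documented behaviour.

```python
import numpy as np

def tic_normalize(intensities):
    y = np.asarray(intensities, dtype=float)
    # Total ion count of each spectrum, kept as a column for broadcasting.
    tic = y.sum(axis=-1, keepdims=True)
    # Spectra with a zero TIC are left at zero instead of dividing by zero.
    return np.divide(y, tic, out=np.zeros_like(y), where=tic != 0)

normalized = tic_normalize([[1.0, 1.0, 2.0], [0.0, 0.0, 0.0]])
```

After normalization, every non-empty spectrum sums to 1.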

esmraldi.spectraprocessing.normalization_sic(spectra, indices_peaks, width_peak=10)

SIC (selective ion count) normalization.

Defined as: TIC - sum of peaks of high intensities. Peaks are given with indices_peaks.

Parameters
spectra: np.ndarray

spectra as [mz*I] array

indices_peaks: np.ndarray

indices peaks

width_peak: int

average width of peaks

Returns
np.ndarray

normalized spectrum
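One way to read the definition above, sketched for a single spectrum: intensities falling inside a window around each listed peak are excluded from the total, and the spectrum is divided by what remains. The helper name `sic_normalize`, the centred window of `width_peak` points, and the fallback for a zero denominator are assumptions.

```python
import numpy as np

def sic_normalize(intensities, indices_peaks, width_peak=10):
    y = np.asarray(intensities, dtype=float)
    mask = np.zeros(y.size, dtype=bool)
    half = width_peak // 2
    for i in indices_peaks:
        # Mark the points assumed to belong to a high-intensity peak.
        mask[max(0, i - half): i + half + 1] = True
    # SIC = TIC minus the intensity carried by the excluded peaks.
    sic = y.sum() - y[mask].sum()
    return y / sic if sic != 0 else y.copy()

normalized = sic_normalize([1.0, 1.0, 10.0, 1.0, 1.0], indices_peaks=[2], width_peak=2)
```

Here the TIC is 14, the excluded window carries 12, and every intensity is divided by the remaining 2.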

esmraldi.spectraprocessing.index_groups(indices, step=1, is_ppm=False)

Makes groups of indices.

For realignment and spatial selection.

Parameters
indices: list

list of peak indices

step: int

threshold in indices to create groups

Returns
list

groups=list of list of peak indices
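Grouping by an index threshold can be sketched as splitting the sorted indices wherever two consecutive values are more than `step` apart. The helper name `group_indices` is hypothetical, and the `is_ppm` option is not covered here because its behaviour is not described above.

```python
import numpy as np

def group_indices(indices, step=1):
    idx = np.sort(np.asarray(indices))
    # A new group starts after every gap larger than the threshold.
    breaks = np.where(np.diff(idx) > step)[0] + 1
    return [g.tolist() for g in np.split(idx, breaks)]

groups = group_indices([7, 1, 2, 3, 8, 20], step=1)
```

With `step=1`, indices 1 to 3 form one group, 7 and 8 another, and 20 stays alone.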

esmraldi.spectraprocessing.index_groups_start_end(indices, step=1, is_ppm=False)

esmraldi.spectraprocessing.peak_reference_indices_group(group)

Extracts the reference peak in a group, i.e. the most frequent in a group.

Parameters
group: list

list of peak indices

esmraldi.spectraprocessing.peak_reference_indices_groups(groups)

Extracts the reference peaks for several groups.

Parameters
groups: list

groups=list of list of peak indices

Returns
list

list of reference peak indices

esmraldi.spectraprocessing.peak_reference_indices_median(groups)

Extracts the reference peak in a group as the median peak.

Parameters
groups: list

list of peak indices

Returns
list

list of reference peak indices
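The two reference choices described above (most frequent peak versus median peak) can be compared with a short sketch. The helper names are hypothetical; tie-breaking for the most frequent value and the use of the upper middle element for even-sized groups are assumptions of this sketch.

```python
from collections import Counter

def reference_most_frequent(groups):
    # Most frequent index in each group; ties go to the value seen first.
    return [Counter(group).most_common(1)[0][0] for group in groups]

def reference_median(groups):
    # Middle element of each sorted group, so the result is always a member.
    return [sorted(group)[len(group) // 2] for group in groups]

groups = [[5, 5, 6, 7, 8], [10, 11, 11, 11]]
by_frequency = reference_most_frequent(groups)
by_median = reference_median(groups)
```

The first group shows how the two strategies can disagree: 5 is the most frequent index, while 6 is the median.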

esmraldi.spectraprocessing.width_peak_mzs(aligned_mzs, groups, default=0.001)

Computes the width of a peak by computing the difference in m/z between the upper and lower bounds in the group.

Parameters
aligned_mzs: list

list of m/z

groups: list

list of peak indices

default: float

default width of the peak

Returns
dict

maps m/z to widths
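A rough sketch of the width computation, under the assumption that each group holds the m/z values gathered around one aligned peak (the parameter description above mentions indices, so the real function may first look the values up). The helper name `widths_from_groups` is hypothetical.

```python
def widths_from_groups(aligned_mzs, groups, default=0.001):
    widths = {}
    for mz, group in zip(aligned_mzs, groups):
        # Width = distance between the upper and lower bounds of the group.
        width = max(group) - min(group)
        # Single-valued groups have no extent, so fall back to the default.
        widths[mz] = width if width > 0 else default
    return widths

widths = widths_from_groups([100.25, 200.0], [[100.0, 100.5], [200.0]])
```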

esmraldi.spectraprocessing.width_peak_indices(indices, full_indices)

Computes the width of a peak by checking neighboring indices.

Parameters
indices: np.ndarray

peak indices on the histogram of peaks

full_indices: np.ndarray

peak indices on the spectra

Returns
dict

peak indices to corresponding width

esmraldi.spectraprocessing.closest_peak(num, indices_to_width)

Extracts the closest peak of index num.

Parameters
num: int

index

indices_to_width: dict

dictionary mapping indices to width

Returns
tuple

mz and associated width
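The lookup can be sketched as a nearest-key search over the dictionary. The helper name `closest` is hypothetical, and numeric keys are assumed.

```python
def closest(num, indices_to_width):
    # Key with the smallest absolute distance to num.
    key = min(indices_to_width, key=lambda k: abs(k - num))
    return key, indices_to_width[key]

nearest = closest(12, {3: 2, 10: 4, 25: 1})
```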

esmraldi.spectraprocessing.min_step(mzs, max_len, starting_step=0.0005, incr_step=0.0005)

esmraldi.spectraprocessing.realign_reducing(out_spectra, spectra, step=0.0005, is_ppm=False)

esmraldi.spectraprocessing.realign_mean_spectrum(mzs, intensities, all_mzs, step=0.0005, is_ppm=False, return_stats=False)

esmraldi.spectraprocessing.realign_tree(spectra, mzs, mean_spectra, step=0.0005, is_ppm=False)

esmraldi.spectraprocessing.realign_wrt_peaks_mzs_generic(spectra, aligned_mzs)

esmraldi.spectraprocessing.realign_wrt_peaks_mzs(spectra, aligned_mzs, full_mzs, indices_to_width)

Realign spectra to reference peaks.

Parameters
spectra: np.ndarray

Spectra as [mz*I] array

aligned_mzs: list

reference mz peaklist

full_mzs: np.ndarray

complete mz peaklist over all spectra

indices_to_width: dict

mz to width

Returns
list

realigned spectra

esmraldi.spectraprocessing.realign_wrt_peaks(spectra, aligned_peaks, full_peaks, indices_to_width)

Realign spectra to reference peaks from indices.

Parameters
spectra: np.ndarray

Spectra as [mz*I] array

aligned_peaks: list

reference peak indices

full_peaks: np.ndarray

complete list of peak indices over all spectra

indices_to_width: dict

indices to width

Returns
list

realigned spectra indices

esmraldi.spectraprocessing.realign_indices(spectra, indices, reference='frequence', nb_occurrence=4, step=0.02, is_ppm=False)

Alignment function.

First extracts the peaks on all spectra, then extracts the reference peaks and maps each peak to its closest reference peak.

Parameters
spectra: np.ndarray

spectra

indices: np.ndarray

indices of peaks relative to spectra

Returns
list

realigned spectra

esmraldi.spectraprocessing.realign_mzs(spectra, mzs, reference='frequence', nb_occurrence=4, step=0.02, is_ppm=False)

Alignment function.

First extracts the peaks on all spectra based on local prominence, then extracts the reference peaks and maps each peak to its closest reference peak.

Parameters
spectra: np.ndarray

spectra

mzs: np.ndarray

peak list of m/z ratios

Returns
list

realigned spectra
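The last step of the alignment, mapping each peak to its closest reference peak, can be sketched with a sorted search. This illustrates only the mapping; reference selection (the `reference` and `nb_occurrence` options) is not reproduced. The helper name `snap_to_reference` is hypothetical and the sketch assumes at least two reference values.

```python
import numpy as np

def snap_to_reference(peaks, reference):
    ref = np.sort(np.asarray(reference, dtype=float))
    p = np.asarray(peaks, dtype=float)
    # Index of the first reference value >= each peak, kept inside the array.
    pos = np.clip(np.searchsorted(ref, p), 1, ref.size - 1)
    left, right = ref[pos - 1], ref[pos]
    # Pick whichever neighbouring reference value is nearer.
    return np.where(p - left <= right - p, left, right)

aligned = snap_to_reference([101.0, 149.0, 151.0, 310.0, 90.0], [100.0, 200.0, 300.0])
```

Peaks beyond either end of the reference list are mapped to the first or last reference value.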

esmraldi.spectraprocessing.realign_generic(spectra, peaks, step=np.inf, is_ppm=False)

esmraldi.spectraprocessing.neighbours(index, n, spectra)

Right-sided neighbours of a point in a spectrum.

Parameters
index: int

index to search neighbours from

n: int

number of neighbours

spectra: np.ndarray

spectrum

Returns
np.ndarray

neighbours

esmraldi.spectraprocessing.forward_derivatives(peaks)

Forward derivatives from peak value distribution.

Parameters
peaks: list

peaklist

Returns
list

derivatives
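A forward derivative is the difference between each value and the one that follows it. The sketch below assumes the peak list is a plain sequence of values with unit spacing; the helper name `forward_diff` is hypothetical.

```python
def forward_diff(values):
    # d[i] = values[i + 1] - values[i]; the result has one element fewer.
    return [b - a for a, b in zip(values, values[1:])]

derivatives = forward_diff([1, 4, 2, 2])
```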

esmraldi.spectraprocessing.find_isotopic_pattern(neighbours, tolerance, nb_charges)

Extracts isotopic pattern based on mz similarity and max number of charges.

Parameters
neighbours: np.ndarray

neighbouring mz in spectra

tolerance: float

acceptable mz delta for a peak to be considered isotopic

nb_charges: int

maximum number of charges

Returns
list

pattern peaklist
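One plausible reading of "m/z similarity and maximum number of charges" is that isotopic peaks of an ion with charge z are spaced about 1/z apart. The sketch below follows that reading: for each candidate charge, it chains neighbours whose spacing matches within the tolerance and keeps the longest chain. The helper name `isotopic_pattern`, the `spacing` value of 1.00335 (the 13C/12C mass difference), and the longest-chain rule are all assumptions, not the module's documented algorithm.

```python
def isotopic_pattern(mzs, tolerance=0.1, nb_charges=2, spacing=1.00335):
    mzs = list(mzs)
    best = [mzs[0]]
    for z in range(1, nb_charges + 1):
        pattern = [mzs[0]]
        for mz in mzs[1:]:
            # Next isotope is expected one spacing/z after the last accepted peak.
            expected = pattern[-1] + spacing / z
            if abs(mz - expected) <= tolerance:
                pattern.append(mz)
        if len(pattern) > len(best):
            best = pattern
    return best

pattern = isotopic_pattern([500.0, 500.5, 501.0, 501.6, 503.0], tolerance=0.05, nb_charges=2)
```

In this toy example the doubly charged hypothesis (spacing of about 0.5) explains three peaks, against two for the singly charged one, so it is retained.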

esmraldi.spectraprocessing.find_isotopic_pattern_theoretical_difference(neighbours, th_diff, tolerance, nb_charges, is_ppm=True)

esmraldi.spectraprocessing.peaks_max_intensity_isotopic_pattern(pattern)

Finds the peak with maximum intensity in the isotopic pattern.

Parameters
pattern: list

pattern peaklist

Returns
list

peak with maximum intensity

esmraldi.spectraprocessing.peaks_derivative_isotopic_pattern(pattern)

Finds the peak where the sign of the derivative changes from negative to positive from the pattern intensities.

Parameters
pattern: list

pattern peaklist

Returns
list

peaks where the derivative sign changes
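A negative-to-positive sign change in the derivative marks a local minimum followed by a rise, which may indicate where a new species starts inside a pattern. The sketch below works on a plain list of intensities and returns positions rather than peaks; the helper name `sign_change_indices` and this input format are assumptions.

```python
def sign_change_indices(intensities):
    # Forward differences between consecutive intensities.
    d = [b - a for a, b in zip(intensities, intensities[1:])]
    # Position i + 1 is where a decrease (d[i] < 0) turns into an increase (d[i + 1] > 0).
    return [i + 1 for i in range(len(d) - 1) if d[i] < 0 and d[i + 1] > 0]

changes = sign_change_indices([10, 6, 3, 8, 4])
```

Here the intensities fall until position 2 and rise again afterwards, so position 2 is reported.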

esmraldi.spectraprocessing.isotopes_from_pattern(pattern, peaks_in_pattern)

Find isotopes from a pattern.

Parameters
pattern: list

pattern peaklist

peaks_in_pattern: list

peaks that correspond to other species in the pattern

Returns
np.ndarray

isotopes in the pattern

esmraldi.spectraprocessing.mz_second_isotope_most_abundant(average_distribution)

Determines where the second isotope becomes the most abundant from a given distribution.

Parameters
average_distribution: dict

maps atom mass to its average abundance

Returns
float

mass at which the second isotope is the most abundant

esmraldi.spectraprocessing.peak_to_index(peak, pattern)

Gets the index of a peak in a pattern

Parameters
peak: float

m/z ratio

pattern: list

pattern peaklist

Returns
int

index of peak

esmraldi.spectraprocessing.deisotoping_simple(spectra, tolerance=0.1, nb_neighbours=8, nb_charges=5, average_distribution={})

Simple deisotoping depending on the mass of the second most abundant isotope:

  • Before this mass: uses the peak with max intensity as reference

  • After this mass: uses the peak where the sign of the derivative changes

Parameters
spectra: np.ndarray

peaklist

tolerance: float

acceptable mz delta

nb_neighbours: int

size of patterns

nb_charges: int

maximum number of charges in isotopic pattern

average_distribution: dict

maps atom mass to its average abundance

Returns
np.ndarray

deisotoped spectra

esmraldi.spectraprocessing.deisotoping_simple_reference(spectra, th_diff=1.00335, tolerance=14, nb_neighbours=8, nb_charges=5, is_ppm=True)

Simple deisotoping depending on the mass of the second most abundant isotope:

  • Before this mass: uses the peak with max intensity as reference

  • After this mass: uses the peak where the sign of the derivative changes

Parameters
spectra: np.ndarray

peaklist

tolerance: float

acceptable mz delta

nb_neighbours: int

size of patterns

nb_charges: int

maximum number of charges in isotopic pattern

th_diff: float

theoretical m/z difference between consecutive isotopic peaks

is_ppm: bool

whether tolerance is expressed in ppm

Returns
np.ndarray

deisotoped spectra

esmraldi.spectraprocessing.deisotoping_reference_indices(peaks, th_diff=1.00335, tolerance=14, nb_neighbours=8, nb_charges=5, is_ppm=True)

esmraldi.spectraprocessing.subtract_spectra(target, source)

esmraldi.spectraprocessing.extract_mean_spectra_coordinates(spectra, coordinates, mzs, is_subtract, mean_spectra_matrix)