esmraldi.spectraprocessing
Module for the preprocessing of spectra, specifically designed for MALDI images:
- Peak picking
- Local realignment procedures
- Deisotoping
Module Contents
Functions
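| spectra_sum(spectra) | Computes the spectrum from the sum of all spectra. |
| spectra_mean(spectra) | Computes the average spectrum. |
| spectra_mean_centroided(spectra, mzs=None) | |
| spectra_max(spectra) | Computes the maximum intensity for each abscissa. |
| spectra_min(spectra) | Computes the minimum intensity for each abscissa. |
| spectra_peak_indices(spectra, prominence=50, wlen=10) | Estimates and extracts significant peaks in the spectra. |
| spectra_peak_indices_adaptative(spectra, factor=1, wlen=10) | Estimates and extracts significant peaks in the spectra. |
| spectra_peak_mzs_adaptative(spectra, factor=1, wlen=10) | Estimates and extracts significant peaks in the spectra. |
| spectra_peak_indices_adaptative_noiselevel(spectra, factor=1, noise_level=1, wlen=10) | Estimates and extracts significant peaks in the spectra. |
| spectra_peak_mzs_adaptative_noiselevel(spectra, factor=1, noise_level=1, wlen=10) | Estimates and extracts significant peaks in the spectra. |
| peak_indices(data, prominence, wlen, distance=1) | Estimates and extracts significant peaks in the spectrum. |
| peak_indices_cwt(data, factor, widths) | Peak indices using continuous wavelet transform. |
| spectra_peak_mzs_cwt(spectra, factor, widths) | Peak detection using the Continuous Wavelet Transform. |
| same_mz_axis(spectra, tol=0) | Generates spectra with common m/z values. |
| tic_values(spectra) | |
| normalization_tic(spectra, inplace=False) | TIC (total ion count) normalization. |
| normalization_sic(spectra, indices_peaks, width_peak=10) | SIC (selective ion count) normalization. |
| index_groups(indices, step=1, is_ppm=False) | Makes groups of indices. |
| index_groups_start_end(indices, step=1, is_ppm=False) | |
| peak_reference_indices_group(group) | Extracts the reference peak in a group, i.e. the most frequent in a group. |
| peak_reference_indices_groups(groups) | Extracts the reference peaks for several groups. |
| peak_reference_indices_median(groups) | Extracts the reference peak in a group as the median peak. |
| width_peak_mzs(aligned_mzs, groups, default=0.001) | Computes the width of a peak from the m/z bounds of its group. |
| width_peak_indices(indices, full_indices) | Computes the width of a peak by checking neighboring indices. |
| closest_peak(num, indices_to_width) | Extracts the closest peak of index num. |
| min_step(mzs, max_len, starting_step=0.0005, incr_step=0.0005) | |
| realign_reducing(out_spectra, spectra, step=0.0005, is_ppm=False) | |
| realign_mean_spectrum(mzs, intensities, all_mzs, step=0.0005, is_ppm=False, return_stats=False) | |
| realign_tree(spectra, mzs, mean_spectra, step=0.0005, is_ppm=False) | |
| realign_wrt_peaks_mzs_generic(spectra, aligned_mzs) | |
| realign_wrt_peaks_mzs(spectra, aligned_mzs, full_mzs, indices_to_width) | Realign spectra to reference peaks. |
| realign_wrt_peaks(spectra, aligned_peaks, full_peaks, indices_to_width) | Realign spectra to reference peaks from indices. |
| realign_indices(spectra, indices, reference='frequence', nb_occurrence=4, step=0.02, is_ppm=False) | Alignment function. |
| realign_mzs(spectra, mzs, reference='frequence', nb_occurrence=4, step=0.02, is_ppm=False) | Alignment function. |
| realign_generic(spectra, peaks, step=np.inf, is_ppm=False) | |
| neighbours(index, n, spectra) | Right-sided neighbours of a point in a spectrum. |
| forward_derivatives(peaks) | Forward derivatives from peak value distribution. |
| find_isotopic_pattern(neighbours, tolerance, nb_charges) | Extracts isotopic pattern based on mz similarity and max number of charges. |
| find_isotopic_pattern_theoretical_difference(neighbours, th_diff, tolerance, nb_charges, is_ppm=True) | |
| peaks_max_intensity_isotopic_pattern(pattern) | Finds the peak with maximum intensity in the isotopic pattern. |
| peaks_derivative_isotopic_pattern(pattern) | Finds the peak where the sign of the derivative changes from negative to positive. |
| isotopes_from_pattern(pattern, peaks_in_pattern) | Find isotopes from a pattern. |
| mz_second_isotope_most_abundant(average_distribution) | Determines where the second isotope becomes the most abundant. |
| peak_to_index(peak, pattern) | Gets the index of a peak in a pattern. |
| deisotoping_simple(spectra, tolerance=0.1, nb_neighbours=8, nb_charges=5, average_distribution={}) | Simple deisotoping depending on the mass at which the second isotope becomes the most abundant. |
| deisotoping_simple_reference(spectra, th_diff=1.00335, tolerance=14, nb_neighbours=8, nb_charges=5, is_ppm=True) | Simple deisotoping depending on the mass at which the second isotope becomes the most abundant. |
| deisotoping_reference_indices(peaks, th_diff=1.00335, tolerance=14, nb_neighbours=8, nb_charges=5, is_ppm=True) | |
| subtract_spectra(target, source) | |
| extract_mean_spectra_coordinates(spectra, coordinates, mzs, is_subtract, mean_spectra_matrix) | |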
- esmraldi.spectraprocessing.spectra_sum(spectra)
Computes the spectrum from the sum of all spectra.
- Parameters
- spectra: np.ndarray
Spectra as [mz*I] array
- Returns
- np.ndarray
Sum of spectra
- esmraldi.spectraprocessing.spectra_mean(spectra)
Computes the average spectrum.
- Parameters
- spectra: np.ndarray
Spectra as [mz*I] array
- Returns
- np.ndarray
Mean spectrum
- esmraldi.spectraprocessing.spectra_mean_centroided(spectra, mzs=None)
- esmraldi.spectraprocessing.spectra_max(spectra)
Computes the maximum intensity for each abscissa.
- Parameters
- spectra: np.ndarray
Spectra as [mz*I] array
- Returns
- np.ndarray
Max spectrum
- esmraldi.spectraprocessing.spectra_min(spectra)
Computes the minimum intensity for each abscissa.
- Parameters
- spectra: np.ndarray
Spectra as [mz*I] array
- Returns
- np.ndarray
Min spectrum
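The four aggregations above can be sketched with plain NumPy reductions. This is a minimal illustration, not the library's implementation: it assumes that "[mz*I] array" means an array of shape (n_spectra, 2, n_points) with m/z values in row 0 and intensities in row 1, and that all spectra share the same m/z axis.

```python
import numpy as np

# Assumed layout (not confirmed by the source): shape (n_spectra, 2, n_points),
# row 0 holds m/z values and row 1 holds intensities.
spectra = np.array([
    [[100.0, 101.0, 102.0], [1.0, 5.0, 2.0]],
    [[100.0, 101.0, 102.0], [3.0, 1.0, 4.0]],
])

intensities = spectra[:, 1, :]

# Reduce over the spectrum axis to get one value per abscissa.
sum_spectrum = intensities.sum(axis=0)
mean_spectrum = intensities.mean(axis=0)
max_spectrum = intensities.max(axis=0)
min_spectrum = intensities.min(axis=0)
```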
- esmraldi.spectraprocessing.spectra_peak_indices(spectra, prominence=50, wlen=10)
Estimates and extracts significant peaks in the spectra by using the prominence (height of the peak relative to the nearest higher peak).
- Parameters
- spectra: np.ndarray
Spectra as [mz*I] array
- prominence: int
threshold on prominence
- wlen: int
size of the window
- Returns
- np.ndarray
Peak indices relative to spectra
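As an illustration of prominence-based peak picking on a single spectrum, the sketch below uses `scipy.signal.find_peaks`, which accepts `prominence` and `wlen` arguments with matching names. Whether `spectra_peak_indices` relies on this SciPy function is an assumption; the toy intensities are made up for the example.

```python
import numpy as np
from scipy.signal import find_peaks

# Toy intensity profile: two strong peaks (60 and 80) among small bumps.
intensities = np.array([0, 1, 0, 60, 0, 2, 0, 80, 0, 1, 0], dtype=float)

# Keep only the peaks whose prominence reaches the threshold;
# wlen limits the window in which the prominence is evaluated.
peaks, properties = find_peaks(intensities, prominence=50, wlen=10)
```

With this input, only the two strong peaks pass the prominence threshold; the small bumps are discarded.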
- esmraldi.spectraprocessing.spectra_peak_indices_adaptative(spectra, factor=1, wlen=10)
Estimates and extracts significant peaks in the spectra by using the local prominence (height of the peak relative to the background noise).
Background noise is estimated as the standard deviation of the signal over a window of size wlen.
- Parameters
- spectra: np.ndarray
Spectra as [mz*I] array
- factor: float
prominence factor
- wlen: int
size of the window
- Returns
- np.ndarray
Peak indices relative to spectra
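The idea of a noise-adaptive threshold can be sketched as follows. This is a simplified reading of the description, not the library's code: the helper `adaptive_peak_indices` is hypothetical, it works on one intensity vector, and it keeps a local maximum when its height above the window minimum exceeds `factor` times the standard deviation of a window of about `wlen` points centred on it.

```python
import numpy as np

def adaptive_peak_indices(intensities, factor=1.0, wlen=10):
    """Hypothetical helper: local maxima that stand out from the local noise."""
    half = wlen // 2
    kept = []
    for i in range(1, len(intensities) - 1):
        is_local_max = (intensities[i] > intensities[i - 1]
                        and intensities[i] > intensities[i + 1])
        if not is_local_max:
            continue
        # Window centred on the candidate, clipped at the array borders.
        window = intensities[max(0, i - half): i + half + 1]
        noise = window.std()  # background noise estimate over the window
        if intensities[i] - window.min() > factor * noise:
            kept.append(i)
    return np.array(kept, dtype=int)

# Alternating 0/1 background with a single strong peak at index 20.
signal = np.tile([0.0, 1.0], 21)[:41]
signal[20] = 20.0
peaks = adaptive_peak_indices(signal, factor=3, wlen=10)
```

Raising `factor` makes the selection stricter; here the small oscillations of the background are rejected and only the strong peak remains.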
- esmraldi.spectraprocessing.spectra_peak_mzs_adaptative(spectra, factor=1, wlen=10)
Estimates and extracts significant peaks in the spectra by using the local prominence (height of the peak relative to the background noise).
Background noise is estimated as the standard deviation of the signal over a window of size wlen.
- Parameters
- spectra: np.ndarray
Spectra as [mz*I] array
- factor: float
prominence factor
- wlen: int
size of the window
- Returns
- np.ndarray
Peaks m/z
- esmraldi.spectraprocessing.spectra_peak_indices_adaptative_noiselevel(spectra, factor=1, noise_level=1, wlen=10)
Estimates and extracts significant peaks in the spectra with specified noise level(s), by using the local prominence (height of the peak relative to the background noise).
- Parameters
- spectra: np.ndarray
Spectra as [mz*I] array
- factor: float
prominence factor
- noise_level: float or list
noise level
- wlen: int
size of the window
- Returns
- np.ndarray
Peak indices relative to spectra
- esmraldi.spectraprocessing.spectra_peak_mzs_adaptative_noiselevel(spectra, factor=1, noise_level=1, wlen=10)
Estimates and extracts significant peaks in the spectra with specified noise level(s), by using the local prominence (height of the peak relative to the background noise).
- Parameters
- spectra: np.ndarray
Spectra as [mz*I] array
- factor: float
prominence factor
- noise_level: float or list
noise level
- wlen: int
size of the window
- Returns
- np.ndarray
Peaks m/z
- esmraldi.spectraprocessing.peak_indices(data, prominence, wlen, distance=1)
Estimates and extracts significant peaks in the spectrum, by using the prominence (height of the peak relative to the nearest higher peak).
- Parameters
- data: np.ndarray
Spectra as [mz*I] array
- prominence: int
threshold on prominence
- Returns
- np.ndarray
Peak indices relative to spectrum
- esmraldi.spectraprocessing.peak_indices_cwt(data, factor, widths)
Peak indices using continuous wavelet transform.
- Parameters
- data: np.ndarray
Spectra as [mz*I] array
- factor: float
Threshold SNR
- widths: list
scales
- Returns
- np.ndarray
Peak indices relative to spectrum
- esmraldi.spectraprocessing.spectra_peak_mzs_cwt(spectra, factor, widths)
Peak detection using the Continuous Wavelet Transform.
- Parameters
- spectra: np.ndarray
Spectra as [mz*I] array
- factor: float
CWT threshold
- widths: list
Wavelet widths
- Returns
- np.ndarray
Detected peak m/z ratios
- esmraldi.spectraprocessing.same_mz_axis(spectra, tol=0)
Generates spectra with common m/z values.
Missing intensity values are added as np.nan.
- Parameters
- spectra: np.ndarray
Spectra as [mz*I] array
- tol: float
Tolerance to consider when two species are the same
- Returns
- np.ndarray
Spectra as [mz*I] array
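Placing spectra on a common m/z axis can be sketched like this. The sketch covers only exact matching (the `tol=0` case) and represents each spectrum as a hypothetical `(mz, intensity)` pair of arrays; it is an illustration of the idea rather than the library's implementation.

```python
import numpy as np

# Two spectra with different m/z axes, each given as (mz, intensity) arrays.
spectra = [
    (np.array([100.0, 101.0]), np.array([5.0, 3.0])),
    (np.array([101.0, 102.0]), np.array([7.0, 2.0])),
]

# Common axis: sorted union of all m/z values.
common_mz = np.unique(np.concatenate([mz for mz, _ in spectra]))

# Intensities missing from a spectrum stay as np.nan.
aligned = np.full((len(spectra), common_mz.size), np.nan)
for row, (mz, intensity) in enumerate(spectra):
    aligned[row, np.searchsorted(common_mz, mz)] = intensity
```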
- esmraldi.spectraprocessing.tic_values(spectra)
- esmraldi.spectraprocessing.normalization_tic(spectra, inplace=False)
TIC (total ion count) normalization.
Divides each intensity in a spectrum by the sum of all its intensities.
- Parameters
- spectra: np.ndarray
spectra as [mz*I] array
- Returns
- np.ndarray
normalized spectrum
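The TIC rule stated above (each intensity divided by the sum of the intensities of its spectrum) amounts to the following NumPy operation. The sketch works on a bare intensity matrix with one spectrum per row, which is an assumption about the layout.

```python
import numpy as np

# One spectrum per row (intensities only).
intensities = np.array([[1.0, 3.0, 4.0],
                        [2.0, 2.0, 6.0]])

# Total ion count of each spectrum, kept as a column for broadcasting.
tic = intensities.sum(axis=1, keepdims=True)
normalized = intensities / tic
```

After normalization, the intensities of each spectrum sum to 1.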
- esmraldi.spectraprocessing.normalization_sic(spectra, indices_peaks, width_peak=10)
SIC (selective ion count) normalization.
The normalization factor is defined as the TIC minus the sum of the high-intensity peaks. Peaks are given by indices_peaks.
- Parameters
- spectra: np.ndarray
spectra as [mz*I] array
- indices_peaks: np.ndarray
indices peaks
- width_peak: int
average width of peaks
- Returns
- np.ndarray
normalized spectrum
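A minimal sketch of the SIC factor on one spectrum follows. How the library delimits each peak is not stated here, so the window of roughly `width_peak` points centred on each peak index is an assumption, as are the toy values.

```python
import numpy as np

intensities = np.array([1.0, 1.0, 50.0, 1.0, 1.0, 1.0, 1.0, 1.0])
indices_peaks = np.array([2])
width_peak = 2  # small width chosen for the toy data

# Mark the points that belong to a high-intensity peak
# (assumed: a window centred on each peak index).
mask = np.zeros(intensities.size, dtype=bool)
half = width_peak // 2
for i in indices_peaks:
    mask[max(0, i - half): i + half + 1] = True

# SIC = TIC minus the intensity carried by the selected peaks.
sic = intensities.sum() - intensities[mask].sum()
normalized = intensities / sic
```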
- esmraldi.spectraprocessing.index_groups(indices, step=1, is_ppm=False)
Makes groups of indices.
For realignment and spatial selection.
- Parameters
- indices: list
list of peak indices
- step: int
threshold in indices to create groups
- Returns
- list
groups=list of list of peak indices
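Grouping can be pictured as cutting the sorted index list wherever two consecutive indices are further apart than `step`. The helper below is hypothetical and ignores the `is_ppm` option; it only illustrates the thresholding idea.

```python
import numpy as np

def group_indices(indices, step=1):
    """Hypothetical helper: start a new group whenever the gap exceeds step."""
    indices = np.sort(np.asarray(indices))
    # Positions where the gap to the previous index is larger than step.
    breaks = np.where(np.diff(indices) > step)[0] + 1
    return [group.tolist() for group in np.split(indices, breaks)]

groups = group_indices([1, 2, 3, 10, 11, 20], step=1)
```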
- esmraldi.spectraprocessing.index_groups_start_end(indices, step=1, is_ppm=False)
- esmraldi.spectraprocessing.peak_reference_indices_group(group)
Extracts the reference peak in a group, i.e. the most frequent in a group.
- Parameters
- group: list
list of peak indices
- esmraldi.spectraprocessing.peak_reference_indices_groups(groups)
Extracts the reference peaks for several groups.
- Parameters
- groups: list
groups=list of list of peak indices
- Returns
- list
list of reference peak indices
- esmraldi.spectraprocessing.peak_reference_indices_median(groups)
Extracts the reference peak in a group as the median peak.
- Parameters
- groups: list
list of peak indices
- Returns
- list
list of reference peak indices
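The two ways of choosing a reference peak described above (most frequent index, or median index) can be compared on toy groups with the standard library. The groups are made up, and using `median_low` so that the result is always a member of the group is a choice of this sketch, not something the source specifies.

```python
from collections import Counter
import statistics

# Each group lists the peak indices observed across spectra.
groups = [[10, 11, 11, 12], [40, 41, 41, 41, 43]]

# Reference as the most frequent index in each group.
most_frequent = [Counter(group).most_common(1)[0][0] for group in groups]

# Reference as the (low) median index of each group.
median_refs = [statistics.median_low(group) for group in groups]
```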
- esmraldi.spectraprocessing.width_peak_mzs(aligned_mzs, groups, default=0.001)
Computes the width of a peak by computing the difference in m/z between the upper and lower bounds in the group.
- Parameters
- aligned_mzs: list
list of m/z
- groups: list
list of peak indices
- default: float
default width of the peak
- Returns
- dict
maps m/z to widths
- esmraldi.spectraprocessing.width_peak_indices(indices, full_indices)
Computes the width of a peak by checking neighboring indices.
- Parameters
- indices: np.ndarray
peak indices on the histogram of peaks
- full_indices: np.ndarray
peak indices on the spectra
- Returns
- dict
peak indices to corresponding width
- esmraldi.spectraprocessing.closest_peak(num, indices_to_width)
Extracts the closest peak of index num.
- Parameters
- num: int
index
- indices_to_width: dict
dictionary mapping indices to width
- Returns
- tuple
mz and associated width
- esmraldi.spectraprocessing.min_step(mzs, max_len, starting_step=0.0005, incr_step=0.0005)
- esmraldi.spectraprocessing.realign_reducing(out_spectra, spectra, step=0.0005, is_ppm=False)
- esmraldi.spectraprocessing.realign_mean_spectrum(mzs, intensities, all_mzs, step=0.0005, is_ppm=False, return_stats=False)
- esmraldi.spectraprocessing.realign_tree(spectra, mzs, mean_spectra, step=0.0005, is_ppm=False)
- esmraldi.spectraprocessing.realign_wrt_peaks_mzs_generic(spectra, aligned_mzs)
- esmraldi.spectraprocessing.realign_wrt_peaks_mzs(spectra, aligned_mzs, full_mzs, indices_to_width)
Realign spectra to reference peaks.
- Parameters
- spectra: np.ndarray
Spectra as [mz*I] array
- aligned_mzs: list
reference mz peaklist
- full_mzs: np.ndarray
complete mz peaklist over all spectra
- indices_to_width: dict
mz to width
- Returns
- list
realigned spectra
- esmraldi.spectraprocessing.realign_wrt_peaks(spectra, aligned_peaks, full_peaks, indices_to_width)
Realign spectra to reference peaks from indices.
- Parameters
- spectra: np.ndarray
Spectra as [mz*I] array
- aligned_peaks: list
reference peak indices
- full_peaks: np.ndarray
complete peak indices over all spectra
- indices_to_width: dict
indices to width
- Returns
- list
realigned spectra indices
- esmraldi.spectraprocessing.realign_indices(spectra, indices, reference='frequence', nb_occurrence=4, step=0.02, is_ppm=False)
Alignment function.
First extracts the peaks on all spectra, then extracts the reference peaks and maps each peak to its closest reference peak.
- Parameters
- spectra: np.ndarray
spectra
- indices: np.ndarray
indices of peaks relative to spectra
- Returns
- list
realigned spectra
- esmraldi.spectraprocessing.realign_mzs(spectra, mzs, reference='frequence', nb_occurrence=4, step=0.02, is_ppm=False)
Alignment function.
First extracts the peaks on all spectra based on local prominence, then extracts the reference peaks and maps each peak to its closest reference peak.
- Parameters
- spectra: np.ndarray
spectra
- mzs: np.ndarray
peak list of m/z ratios
- Returns
- list
realigned spectra
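The last step of the alignment, mapping each observed peak to its closest reference peak, can be sketched with `np.searchsorted`. The reference and observed m/z values are made up, and the sketch leaves out the `step` tolerance and the selection of reference peaks by `nb_occurrence`.

```python
import numpy as np

reference = np.array([100.0, 200.0, 300.0])  # sorted reference m/z values
observed = np.array([99.98, 200.01, 299.5, 100.02])

# For each observed m/z, locate the neighbouring reference values.
right = np.clip(np.searchsorted(reference, observed), 1, reference.size - 1)
left = right - 1

# Pick whichever neighbour is closer.
use_left = (observed - reference[left]) <= (reference[right] - observed)
realigned = np.where(use_left, reference[left], reference[right])
```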
- esmraldi.spectraprocessing.realign_generic(spectra, peaks, step=np.inf, is_ppm=False)
- esmraldi.spectraprocessing.neighbours(index, n, spectra)
Right-sided neighbours of a point in a spectrum.
- Parameters
- index: int
index to search neighbours from
- n: int
number of neighbours
- spectra: np.ndarray
spectrum
- Returns
- np.ndarray
neighbours
- esmraldi.spectraprocessing.forward_derivatives(peaks)
Forward derivatives from peak value distribution.
- Parameters
- peaks: list
peaklist
- Returns
- list
derivatives
- esmraldi.spectraprocessing.find_isotopic_pattern(neighbours, tolerance, nb_charges)
Extracts isotopic pattern based on mz similarity and max number of charges.
- Parameters
- neighbours: np.ndarray
neighbouring mz in spectra
- tolerance: float
acceptable mz delta for a peak to be considered isotopic
- nb_charges: int
maximum number of charges
- Returns
- list
pattern peaklist
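To illustrate what "mz similarity and max number of charges" can mean, the sketch below looks for peaks spaced by about 1.00335/z (the approximate 13C/12C mass difference divided by the charge z) and keeps the charge that yields the longest chain. The helper name, the chaining strategy and the toy m/z values are assumptions made for the example.

```python
C13_C12_DIFF = 1.00335  # approximate mass difference between 13C and 12C, in Da

def isotopic_pattern(mzs, tolerance, nb_charges):
    """Hypothetical helper: longest chain of peaks spaced by ~1.00335/z."""
    best_charge, best_pattern = 0, [mzs[0]]
    for charge in range(1, nb_charges + 1):
        spacing = C13_C12_DIFF / charge
        pattern = [mzs[0]]
        for mz in mzs[1:]:
            # Accept a peak if it sits one isotopic spacing after the last one.
            if abs((mz - pattern[-1]) - spacing) <= tolerance:
                pattern.append(mz)
        if len(pattern) > len(best_pattern):
            best_charge, best_pattern = charge, pattern
    return best_charge, best_pattern

charge, pattern = isotopic_pattern([500.0, 500.5, 501.0, 501.7], 0.01, 3)
```

In this toy case the peaks are spaced by about 0.5, so the doubly charged hypothesis gives the longest pattern and the peak at 501.7 is left out.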
- esmraldi.spectraprocessing.find_isotopic_pattern_theoretical_difference(neighbours, th_diff, tolerance, nb_charges, is_ppm=True)
- esmraldi.spectraprocessing.peaks_max_intensity_isotopic_pattern(pattern)
Finds the peak with maximum intensity in the isotopic pattern.
- Parameters
- pattern: list
pattern peaklist
- Returns
- list
peak with maximum intensity
- esmraldi.spectraprocessing.peaks_derivative_isotopic_pattern(pattern)
Finds the peak where the sign of the derivative changes from negative to positive from the pattern intensities.
- Parameters
- pattern: list
pattern peaklist
- Returns
- list
peaks where the derivative sign changes
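Detecting a negative-to-positive sign change in the intensity derivative can be sketched with forward differences. The toy intensities are made up, and reporting the index of the point where the change occurs (rather than the following peak) is a choice of this sketch.

```python
import numpy as np

# Intensities of the successive peaks in a pattern.
intensities = np.array([100.0, 60.0, 20.0, 80.0, 40.0])

# Forward differences between consecutive peaks.
derivative = np.diff(intensities)

# Points where the derivative goes from negative to positive.
turning = np.where((derivative[:-1] < 0) & (derivative[1:] > 0))[0] + 1
```

Here the intensities decrease down to index 2 and rise again afterwards, which suggests that a second species starts there.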
- esmraldi.spectraprocessing.isotopes_from_pattern(pattern, peaks_in_pattern)
Find isotopes from a pattern.
- Parameters
- pattern: list
pattern peaklist
- peaks_in_pattern: list
peaks that correspond to other species in the pattern
- Returns
- np.ndarray
isotopes in the pattern
- esmraldi.spectraprocessing.mz_second_isotope_most_abundant(average_distribution)
Determines where the second isotope becomes the most abundant from a given distribution.
- Parameters
- average_distribution: dict
maps atom mass to its average abundance
- Returns
- float
mass at which the second isotope is the most abundant
- esmraldi.spectraprocessing.peak_to_index(peak, pattern)
Gets the index of a peak in a pattern.
- Parameters
- peak: float
m/z ratio of the peak
- pattern: list
pattern peaklist
- Returns
- int
index of peak
- esmraldi.spectraprocessing.deisotoping_simple(spectra, tolerance=0.1, nb_neighbours=8, nb_charges=5, average_distribution={})
Simple deisotoping depending on the mass at which the second isotope becomes the most abundant:
Before this mass: uses the peak with max intensity as reference.
After this mass: uses the peak where the sign of the derivative changes.
- Parameters
- spectra: np.ndarray
peaklist
- tolerance: float
acceptable mz delta
- nb_neighbours: int
size of patterns
- nb_charges: int
maximum number of charges in isotopic pattern
- average_distribution: dict
maps atom mass to its average abundance
- Returns
- np.ndarray
deisotoped spectra
- esmraldi.spectraprocessing.deisotoping_simple_reference(spectra, th_diff=1.00335, tolerance=14, nb_neighbours=8, nb_charges=5, is_ppm=True)
Simple deisotoping depending on the mass at which the second isotope becomes the most abundant:
Before this mass: uses the peak with max intensity as reference.
After this mass: uses the peak where the sign of the derivative changes.
- Parameters
- spectra: np.ndarray
peaklist
- tolerance: float
acceptable mz delta
- nb_neighbours: int
size of patterns
- nb_charges: int
maximum number of charges in isotopic pattern
- th_diff: float
theoretical m/z difference between consecutive isotopes
- is_ppm: bool
whether tolerance is expressed in ppm
- Returns
- np.ndarray
deisotoped spectra
- esmraldi.spectraprocessing.deisotoping_reference_indices(peaks, th_diff=1.00335, tolerance=14, nb_neighbours=8, nb_charges=5, is_ppm=True)
- esmraldi.spectraprocessing.subtract_spectra(target, source)
- esmraldi.spectraprocessing.extract_mean_spectra_coordinates(spectra, coordinates, mzs, is_subtract, mean_spectra_matrix)