spec2vec.SpectrumDocument module

class spec2vec.SpectrumDocument.SpectrumDocument(spectrum, n_decimals: int = 2)[source]

Bases: spec2vec.Document.Document

Create documents from spectra.

Every peak (and loss) position (m/z value) is converted into a string “word”. The full list of peak words forms a spectrum document. Peak words have the form “peak@100.32” (for n_decimals=2), and loss words have the form “loss@100.32”. Peaks that produce identical strings are not merged, so the same word can occur multiple times in a document (e.g. peaks at 100.31 and 100.29 both yield the word “peak@100.3” when using n_decimals=1, and that word then appears twice).

For example:

import numpy as np
from matchms import Spectrum
from spec2vec import SpectrumDocument

spectrum = Spectrum(mz=np.array([100.0, 150.0, 200.51]),
                    intensities=np.array([0.7, 0.2, 0.1]),
                    metadata={'compound_name': 'substance1'})
spectrum_document = SpectrumDocument(spectrum, n_decimals=1)

print(spectrum_document.words)
print(spectrum_document.peaks.mz)
print(spectrum_document.get("compound_name"))

This should output:

['peak@100.0', 'peak@150.0', 'peak@200.5']
[100.   150.   200.51]
substance1

__init__(spectrum, n_decimals: int = 2)[source]

Parameters
  • spectrum (SpectrumType) – Input spectrum.

  • n_decimals (int) – Peak positions are converted to strings with n_decimals decimal places. The default is 2, which converts a peak at 100.387 into the word “peak@100.39”.
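The word format described above can be sketched in plain Python without installing spec2vec. This is a minimal illustration that assumes fixed-point string formatting; the helper names mz_to_word and mzs_to_words are hypothetical and are not part of the spec2vec API, and the library's own implementation may differ in detail.

```python
from collections import Counter


def mz_to_word(mz: float, n_decimals: int = 2, prefix: str = "peak") -> str:
    """Format one m/z position as a word such as 'peak@100.39'.

    Hypothetical helper for illustration only (not a spec2vec function).
    """
    return f"{prefix}@{mz:.{n_decimals}f}"


def mzs_to_words(mzs, n_decimals: int = 2, prefix: str = "peak"):
    """Convert a sequence of m/z values to words, keeping duplicates."""
    return [mz_to_word(mz, n_decimals, prefix) for mz in mzs]


# Default of two decimals, as in the parameter description.
print(mz_to_word(100.387))                 # peak@100.39

# Two distinct peaks collapse onto the same word with n_decimals=1,
# and both copies are kept in the resulting list.
words = mzs_to_words([100.31, 100.29], n_decimals=1)
print(words)                               # ['peak@100.3', 'peak@100.3']
print(Counter(words)["peak@100.3"])        # 2

# Losses use the same format with a different prefix.
print(mz_to_word(100.32, prefix="loss"))   # loss@100.32
```

A smaller n_decimals makes more peaks share a word, which shrinks the vocabulary at the cost of mass resolution; a larger value does the opposite.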

get(key: str, default=None)[source]

Retrieve a value from the Spectrum metadata dict; default is returned if key is not present. Shorthand for

val = self._obj.metadata[key]

property losses

Return losses of original spectrum.

property metadata

Return metadata of original spectrum.

property peaks

Return peaks of original spectrum.