MINT_MS2SMILES_trainer:
  MSP Pickling:
    Perform MSP pickling: True
    Directory to store deconvoluted PKL file: path/to/folder
    Directory to MSP files: path/to/folder
    MSP files: LIPIDMSPs_NEG.msp # A string OR a list of MSP files in [brackets]
    Minimum m/z: 50
    Maximum m/z: 1000
    Interval m/z: 0.01 # This parameter is also used as the maximum mass deviation
    Minimum number of peaks: 5
    Maximum number of peaks: 512
    Noise removal threshold: 0.01
    Allowed spectral entropy: True
    Maximum length of SMILES characters: 200
    Number of CPU processing threads: 35

  Model Parameters:
    Number of m/z tokens: 95003 # Calculated as: 3 + (Maximum m/z - Minimum m/z) / Interval m/z
    Dimension of model: 512 # General dimension of the model
    Embedding norm of m/z tokens: 2
    Dropout probability of embedded m/z: 0.1
    Maximum length of SMILES characters: 200 # Same value as `Maximum length of SMILES characters` in `MSP Pickling`
    Embedding norm of SMILES: 2
    Embedding norm of SMILES sequence: 2
    Dropout probability of embedded SMILES sequence: 0.1
    Number of attention heads: 2
    Number of encoder layers: 4
    Number of decoder layers: 4
    Dropout probability of transformer: 0.1
    Activation function: relu # relu OR gelu

  Training Parameters:
    Reset model weights: True
    ## To further train an existing model (transfer learning), provide its address below and set `Reset model weights` to `False`
    Model address to train: path/to/folder/MINT_MS2SMILES_model.pth
    Directory to store the trained model: path/to/folder
    Directory to load deconvoluted PKL file: path/to/folder # Same directory as `Directory to store deconvoluted PKL file` in `MSP Pickling`
    Device: None # cuda OR cpu. When None, the processing device is detected automatically.
    Cross entropy LOSS function:
      Label smoothing: 0.1
    Adam optimizer function:
      Learning rate: 1e-5
      Beta1: 0.9
      Beta2: 0.98
      Epsilon: 1e-09
    Maximum number of epochs: 300
    Maximum number of ions per training step: 2000
    Split ratio between training and validation sets: [0.80, 0.20]
    Random state: 67
    Number of CPU processing threads: 35
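
# Worked check of `Number of m/z tokens`, using the formula given in its comment
# and the m/z values set under `MSP Pickling` in this file:
#   3 + (Maximum m/z - Minimum m/z) / Interval m/z
#   = 3 + (1000 - 50) / 0.01
#   = 3 + 95000
#   = 95003
# If any of the three m/z settings is changed, this value presumably has to be
# recomputed so that the two sections stay consistent.
#
# Sketch of the list form that the `MSP files` comment mentions. The second
# file name is hypothetical and only illustrates the bracket syntax:
#   MSP files: [LIPIDMSPs_NEG.msp, LIPIDMSPs_POS.msp]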