NmrCalc July 2011

Use of NmrCalc in integration, data selection and merging issues, upcoming model changes.

Model Changes

  • Delete NmrCalc.DefaultParameter - we now use WmsProtocol for this kind of thing.
  • Change NmrCalc.EnergyTerm.constraintLists to derived and one-way.
    Add EnergyTerm.constraintStoreSerial and .constraintListSerials to hold link information
    NBNB Handled by making EnergyTerm a subclass of ConstraintStoreData.
  • Add ResonanceGroup.clusterCode (Line) attribute. (groups e.g. output from same assignment run)
    Add ResonanceGroup.isActive (Boolean, default:True) (inactive ResonanceGroups are generally ignored)
    Add constraint, so that only active ResonanceGroups can have chains, residue, and resonances links.
    See NmrCalc.SpinSystemData section below for the way this should work.
  • Add NmrCalc.PeakListData derived link 'peaks'
    Add PeakListData.peakSerials attribute.
  • Change NmrCalc.SpinSystemData.resonanceGroups from 0..1 to *, change .resonanceGroupSerial to .resonanceGroupSerials, and change derivation functions accordingly.
    Programs that produce ResonanceGroups will generally produce them in sets.
  • Add Symmetry.MolSymmetrySet.name (Line) and .details (Text).
  • Add WmsProtocol -> 0..1 AnnealProtocol derived link
  • Change derived links from MolStructure to MolSystem to optional, so that a missing MolSystem.atom does not render the MolStructure invalid.

OK. All above now done

  • POSSIBLY prohibit deletion of MolSystem.Atom if there are links to it (e.g. from AtomSet, FixedAtomSet).

NBNB this is not done but remains a potential PROBLEM. Deleting MolSystem.Atoms can surreptitiously change FixedResonance assignments.

Merging

Merging data back can only work if the necessary data objects are still there. If the objects are missing we will throw an error, and leave the rest to format conversion. To minimise problems we must set isModifiable=False at project start for certain TopObjects, e.g. MolSystem, NmrConstraintStore.
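
The locking and merge check described above could look roughly like the sketch below. It uses hypothetical stand-in classes rather than the real API: `TopObject`, `MergeError`, `freeze_for_calculation` and `merge_back` are illustrative names only, and the project is modelled as a plain dict.

```python
from dataclasses import dataclass


class MergeError(Exception):
    """Raised when an object needed for merging is no longer present."""


@dataclass
class TopObject:
    # Hypothetical stand-in for a TopObject such as MolSystem or NmrConstraintStore.
    class_name: str
    key: str
    isModifiable: bool = True


def freeze_for_calculation(top_objects, frozen_types=("MolSystem", "NmrConstraintStore")):
    """Set isModifiable=False on the TopObjects a calculation depends on."""
    for obj in top_objects:
        if obj.class_name in frozen_types:
            obj.isModifiable = False


def merge_back(project, required_keys):
    """Return the objects needed for merging, or raise if any is missing.

    `project` is a dict mapping key -> TopObject (an assumption of this sketch).
    """
    missing = [key for key in required_keys if key not in project]
    if missing:
        raise MergeError("cannot merge, missing objects: " + ", ".join(missing))
    return [project[key] for key in required_keys]
```

Raising on the first check, before touching any data, matches the intent above: if the objects are gone, nothing is merged and the rest is left to format conversion.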

Problems

  • Generating ConstraintLists data from MeasurementList data and how to integrate that.
  • When should MolSystem be changeable? There is currently no protection against various things breaking.

 

Individual Data types

MeasurementListData, DerivedListData

Both values and assignments here can change over time, which makes both input and output hard to keep stable afterwards. This is especially hard for shifts, which are recalculated continuously. The obvious solution, proposed here, is to use ConstraintStoreData instead, and convert ShiftLists (etc.) to ChemShiftConstraintLists (etc.). This does mean that the ConstraintLists must be generated explicitly in a separate step, otherwise you get a confusing situation where you select either a ShiftList or a ShiftConstraintList in order to get the same result.
Data Selection: Once the ChemShiftConstraintLists (e.g.) exist selection is unproblematical. You select the ConstraintStore and then select one of the lists.
Data Generation: NB This must be done as a separate step - at a minimum it must be made clear that we are generating and not selecting. We need to filter on chain and residue, atom name, and figOfMerit. What is passed to the program must be a list of measurements, not a MeasurementList. The best set-up is probably to select a list and add a filtering expression and a figOfMerit limit.
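
A minimal sketch of the filtering step, assuming each measurement can be flattened to a record with chain, residue, atom name and figOfMerit. The `Measurement` class and `select_measurements` function are hypothetical names for illustration, not part of the data model. The result is a plain list of measurements, as required above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Measurement:
    """Hypothetical flat record for one assigned measurement, e.g. a shift."""
    chain_code: str
    seq_id: int
    atom_name: str
    value: float
    fig_of_merit: float


def select_measurements(
    measurements: Iterable[Measurement],
    chain_codes: Optional[Set[str]] = None,
    seq_ids: Optional[Set[int]] = None,
    atom_names: Optional[Set[str]] = None,
    min_fig_of_merit: float = 0.0,
) -> List[Measurement]:
    """Return the measurements that pass every filter; None means 'no filter'."""
    selected = []
    for m in measurements:
        if chain_codes is not None and m.chain_code not in chain_codes:
            continue
        if seq_ids is not None and m.seq_id not in seq_ids:
            continue
        if atom_names is not None and m.atom_name not in atom_names:
            continue
        if m.fig_of_merit < min_fig_of_merit:
            continue
        selected.append(m)
    return selected
```

For example, `select_measurements(shifts, chain_codes={"A"}, min_fig_of_merit=0.5)` would keep only chain A measurements with a figOfMerit of at least 0.5.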
Input: By the time data are stored in NmrCalc, they must already be converted to NmrCalc.ConstraintStoreData. This preserves both values and assignments. We shall add new types of NmrConstraint.ConstraintList as necessary. There should be no MeasurementListData or DerivedListData of type input.
Output: Generate new AbstractMeasurementLists and DerivedDataLists, set isSimulated to True, and link using MeasurementListData, DerivedListData.
Merging: Data go back to original NmrProject, with assignments being set from the atom assignments in the input ConstraintStore. Only requires the NmrProject to be there still.

 

PeakListData

Both values and assignments here can change over time, but PeakLists are too large and complex to copy, or to track alternative assignment states. People must convert to Constraints if they want to preserve input, or store program-calculated input as part of the output.
Data Selection: Input selection is an entire PeakList; people can make their own if they want more control. We must allow filtering on figOfMerit. For flexibility we should pass a list of peaks to the actual code.
Input: NmrCalc.PeakListData. If filtering on figOfMerit, the new peakSerials attribute stores which peaks are being used.
Output: Create new PeakList with isSimulated=True.
Merging: Correct spectrum must be present, and identified through SpectrumData or PeakListData.
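
The relation between peakSerials and the derived 'peaks' link can be sketched as follows. This is an illustration under assumptions: `Peak`, `PeakListDataSketch` and `make_peak_list_data` are hypothetical names, the peak list is modelled as a dict from serial to peak, and serials whose peak has since been deleted are simply skipped by the derived link.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Peak:
    # Hypothetical minimal peak: only the fields needed for selection.
    serial: int
    fig_of_merit: float


@dataclass
class PeakListDataSketch:
    """Stand-in for NmrCalc.PeakListData with a stored peakSerials attribute."""
    peak_list: Dict[int, Peak]
    peak_serials: Tuple[int, ...] = ()

    @property
    def peaks(self) -> List[Peak]:
        # Derived, one-way link: resolve stored serials against the live peak list.
        # Serials with no matching peak are skipped (an assumption of this sketch).
        return [self.peak_list[s] for s in self.peak_serials if s in self.peak_list]


def make_peak_list_data(peak_list: Dict[int, Peak], min_fig_of_merit: float) -> PeakListDataSketch:
    """Record which peaks pass the figOfMerit limit by storing their serials."""
    serials = tuple(
        sorted(s for s, p in peak_list.items() if p.fig_of_merit >= min_fig_of_merit)
    )
    return PeakListDataSketch(peak_list=peak_list, peak_serials=serials)
```

Storing serials rather than object references keeps the record of what was used even if peaks are later removed; the calculation code itself would receive `data.peaks`, a plain list.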

 

SpinSystemData

Used for automatic assignment programs, mostly as output, but just might be input as well.
Data Selection: Either automatic or one-by-one selection.
Input: Straightforward NmrCalc.SpinSystemData, linking to existing Resonances.
Output: Each calculation run creates an NmrCalc.SpinSystemData object, linking to Nmr.ResonanceGroups with isActive=False, identified by a common clusterCode. Assignment, assignmentPossibilities, links, and Resonance content are stored using ResidueTypeProb, ResonanceGroupProb, ResidueProb, and ResonanceProb. Programs can later convert (part of) the ResonanceGroups to active, setting links in the process using the XyzProb data, or transfer assignments to input ResonanceGroups in a separate step.
Merging: Can be added to the NmrProject, as long as it exists. Resonances may be created but will generally be pre-existing, so a mapping must be kept. If the program just assigns pre-existing ResonanceGroups (common case), there will be a ResonanceGroupProb link between the new and the input ResonanceGroup (if it still exists).
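
The activation step could be sketched as below. The classes are simplified, hypothetical stand-ins (residues are plain strings, and only ResidueProb is modelled). Picking the highest-weight ResidueProb when activating is an assumption of this sketch; the notes above only say that links are set using the XyzProb data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ResidueProb:
    # Hypothetical stand-in: a candidate residue and its weight.
    residue: str
    weight: float


@dataclass
class ResonanceGroup:
    serial: int
    clusterCode: Optional[str] = None
    isActive: bool = True
    residueProbs: List[ResidueProb] = field(default_factory=list)
    residue: Optional[str] = None  # real link; only allowed when isActive


def activate_cluster(groups: List[ResonanceGroup], cluster_code: str) -> List[ResonanceGroup]:
    """Activate all inactive groups from one calculation run (same clusterCode).

    The residue link is set from the highest-weight ResidueProb, if any
    (an assumption made for illustration).
    """
    activated = []
    for group in groups:
        if group.clusterCode == cluster_code and not group.isActive:
            group.isActive = True
            if group.residueProbs:
                best = max(group.residueProbs, key=lambda rp: rp.weight)
                group.residue = best.residue
            activated.append(group)
    return activated
```

Groups from other runs, and groups that are already active, are left untouched, so several alternative assignment runs can coexist until one is chosen.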

 

ConstraintStoreData, ViolationListData

Data Selection: Select ConstraintStore, then Multiselect box with serial and type of data within the Store. ConstraintStoreData may be generated from e.g. ShiftLists. If you want to use them repeatedly, you should select them as ConstraintLists, not continuously regenerate them, which is why the ConstraintList generation step should be explicit.
Input: Straightforward.
Output: New ConstraintLists and ViolationLists within the same ConstraintStore. For a ChemShiftConstraintList (and similar), the output should also be produced as an Nmr.ShiftList (etc.).
Merging: Should use input NmrConstraintStore if available, but can generate new ConstraintStore at need. Input ConstraintStore should be set to unmodifiable on calculation start.

 

MolSystemData

Input only - no merging problems.
Data Selection: First MolSystem, then individual chains (text entry or multi-select).
MolSymmetrySet must be generated somehow. Simple situations (symmetric homopolymers) could be done with simple buttons, and created as part of program entry. The full complexity, where parts of chains can be symmetric, requires a dedicated editor and must be done in advance.

 

MolResidueData

Mostly input, and output will be a matter of selecting pre-existing residues. No merge problems: chainCode and residueSeqId can be set whether or not the corresponding object is there.
Data Selection: First select MolSystem (drop-down list), then enter residues as text (e.g. A:1-11,15,17-22; B:; C:23). NB we must use seqIds, not seqCodes (otherwise we would need to cater for seqInsertCodes); this is a potential source of confusion. If we want seqCodes we would have to give a selection list instead.
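
A possible parser for the text entry format is sketched below. `parse_residue_selection` is a hypothetical helper, not existing code. Two assumptions are made: a chain code with nothing after the colon (e.g. `B:`) means the whole chain and is returned as `None`, and seqIds are positive integers, so a `-` is always a range separator.

```python
from typing import Dict, List, Optional


def parse_residue_selection(text: str) -> Dict[str, Optional[List[int]]]:
    """Parse e.g. 'A:1-11,15,17-22; B:; C:23' into {chainCode: seqIds}.

    A chain with no residues listed maps to None (assumed: whole chain).
    """
    selection: Dict[str, Optional[List[int]]] = {}
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue
        chain_code, sep, ranges = part.partition(":")
        if not sep:
            raise ValueError("missing ':' in %r" % part)
        chain_code = chain_code.strip()
        ranges = ranges.strip()
        if not ranges:
            selection[chain_code] = None
            continue
        seq_ids: List[int] = []
        for item in ranges.split(","):
            first, dash, last = item.strip().partition("-")
            if dash:
                # Inclusive range such as 17-22.
                seq_ids.extend(range(int(first), int(last) + 1))
            else:
                seq_ids.append(int(first))
        selection[chain_code] = seq_ids
    return selection
```

Because the result holds only chainCode and seqId values, it can be stored whether or not the corresponding residues are still present.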

StructureEnsembleData

Data Selection: Drop-down list with molSystemCode, ensembleId, then a text box for model numbers. A selection table could be used but only makes sense if it contains more information than just the model numbers.
Merging: Output is a new StructureEnsemble. Requires the MolSystem to be still available, and this should be set to unmodifiable during calculations.

 

SpectrumData

Mostly input.
Merging: To produce a new DataSource, the Experiment would need to be present, at least in the most obvious way of working.

 

EnergyTerm

Input only, reference data. Not currently used. Unproblematical.

 

ExternalData

File pointers. Straightforward for static URLs. For data associated with the project we should consider making a location tied to the Repository containing the project.

TensorData, FloatMatrixData, RunParameter

String or numerical data, all straightforward.