Formulating a Conservation of Mass Balance Problem RRS feed

  • Question

  • We are having difficulty getting reasonable results from our model and solicit comments and suggestions. The model is described below:

    We are formulating a mass balance (conservation) inference engine.  The system being analyzed consists two types of objects: flow meters and physical sections.

    A Flow Meter represents a single flow rate between two nodes in the system.

    A Physical Section has a mass that can increase or decrease.  This rate of increase of mass is called the Packing Rate.  The flow meter flow rates have the same units as the packing rates (e.g. kg / sec). 

    A simple non-branching configuration would be a flow meter connected to a physical section that is connected to another flow meter and so on.  A Volume Balance Section (Vbs) consists of an upstream flow meter(s), a downstream flow meter(s) and 0 or 1 Physical Segments between the Flow Meters.

    Mass conservation requires that the sum of the flows into the Vbs minus the flows out of the Vbs balance exactly match the packing rate of physical segment (if any) in the Vbs.  I.e, sum(flows in) - sum(flows out) = packing rate of Physical Segment. 

    We have a series of observations in time of the observed flow rates and observed packing rates.  We call these observations "snapshots".

    We assume that the True flow rate of each flow meter at each snapshot is the observed value minus the meter error (a Gaussian). 

    We assume that the True packing rate of each physical segment at each snapshot is the observed value minus an error that that is the sum of two components: 1) the product of a fractional error (a Gaussian) and the observed packing rate; and 2) an additive error (a Gaussian).

    Our first question has to do with the best way to impose the mass balance constraint.  In our code, it is expressed using an equality constraint as follows:

    Variable<double> Zero = Variable.New<double>().Named("Zero");
    Zero.ObservedValue = 0;         
                = Variable.Sum(vbsTrueInFlows[ISnapshot][IVbs]).Named("totalVbsTrueInFlow")
                - Variable.Sum(vbsTrueOutFlows[ISnapshot][IVbs]).Named("totalVbsTrueOutFlow");
    TrueVolumeBalance[IVbs][ISnapshot] = TrueFlowBalance[IVbs][ISnapshot] -TruePackingRate[IVbs][ISnapshot];
    Variable.ConstrainEqual(TrueVolumeBalance[IVbs][ISnapshot], Zero);

    Is there a better way to impose the constraint?

    The only Inference Engine that is working with our code is Expectation Propagation.  We are unclear as to why this is.

    Our model is not giving the results we would expect.  We are applying a very noisy signal to one meter (large standard deviation) and would expect that that after imposing 1000 observations that that meter's error marginal error would have a much larger variance than the other meters.  However, it all meters have similar variances.  We need to understand if it is a fault of our model, our understanding, or Infer.Net. 

    The full model code is listed below.  Any comments or suggestions will be much appreciated. - Thanks, Ed

       public class SimpleVolumeBalanceSections
          private bool IsModelCreated { get; set; }
          #region Range Variables
          /// <summary>
          /// Range variable to iterate over the Meters flowing into a VBS.
          /// </summary>
          public Range IVbsInMeter { get; set; }
          /// <summary>
          /// Range variable to iterate over the Meters flowing out of a VBS
          /// </summary>
          public Range IVbsOutMeter { get; set; }
          /// <summary>
          /// Range variable to iterate over the volume balance sections 
          /// </summary>
          public Range IVbs { get; set; }
          /// <summary>
          /// Range variable to iterate over all Meters
          /// </summary>
          public Range IMeter { get; set; }
          /// <summary>
          /// Range variable to iterate over all data snapshots.
          /// </summary>
          public Range ISnapshot { get; set; }
          #region Physical Configuration Variables
          // The variables in this section define the physical configuration of the
          // the Volume Balance Sections
          /// <summary>
          /// Number of total meters
          /// </summary>
          public Variable<int> NumMeters { get; set; }
          /// <summary>
          /// Number of total volume balance sections
          /// </summary>
          public Variable<int> NumVbs { get; set; }
          /// <summary>
          /// Number of meters into each volume balance section.  The range
          /// variable for this array is IVbs
          /// </summary>
          public VariableArray<int> NumVbsInMeters { get; set; }
          /// <summary>
          /// Number of meters out of each volume balance section.  The range
          /// variable for this array is IVbs
          /// </summary>
          public VariableArray<int> NumVbsOutMeters { get; set; }
          /// <summary>
          /// Lists the meters into each VBS.  This is indexed using range
          /// variables as VbsInMeters[IVbs][IVbsInMeter]
          /// </summary>
          public VariableArray<VariableArray<int>, int[][]> VbsInMeters { get; set; }
          /// <summary>
          /// Lists the meters out of each VBS.  This is indexed using range
          /// variables as VbsOutMeters[IVbs][IVbsOutMeter]
          /// </summary>
          public VariableArray<VariableArray<int>, int[][]> VbsOutMeters { get; set; }
          #region Observations
          /// <summary>
          /// Number of data snapshots used for inference
          /// </summary>
          public Variable<int> NumSnapshots { get; set; }
          /// <summary>
          /// This variable is flow meter data snapshots indexed by range variables as
          /// MeterFlow[IMeter][ISnapshot].  
          /// </summary>
          public VariableArray<VariableArray<double>, double[][]> MeterFlowSnapshots { get; set; }
          /// <summary>
          /// This variable represents the model computed packing rates.  It is 
          /// indexed by range variables as
          /// PackingRate[IVbs][ISnapshot]
          /// </summary>
          public VariableArray<VariableArray<double>, double[][]> PackingRateSnapshots { get; set; }
          #region Priors
          public VariableArray<Gaussian> PriorMeterErrorMean { get; set; }
          public VariableArray<Gamma> PriorMeterErrorPrecision { get; set; }
          public VariableArray<Gaussian> PriorPackingRateAdditiveErrorMean { get; set; }
          public VariableArray<Gamma> PriorPackingRateAdditiveErrorPrecision { get; set; }
          public VariableArray<Gaussian> PriorPackingRateFractionalErrorMean { get; set; }
          public VariableArray<Gamma> PriorPackingRateFractionalErrorPrecision { get; set; }
          #region Modelled and Constrained Variables
          public VariableArray<VariableArray<double>, double[][]> TrueVolumeBalance { get; set; }
          public VariableArray<VariableArray<double>, double[][]> TrueFlowBalance { get; set; }
          public VariableArray<double> MeterErrorPrecision { get; set; }
          public VariableArray<double> MeterErrorMean { get; set; }
          public VariableArray<VariableArray<double>, double[][]> MeterFlowError { get; set; }
          public VariableArray<double> PackingRateAdditiveErrorPrecision { get; set; }
          public VariableArray<double> PackingRateAdditiveErrorMean { get; set; }
          public VariableArray<VariableArray<double>, double[][]> PackingRateAdditiveError { get; set; }
          public VariableArray<VariableArray<double>, double[][]> TrueFlow { get; set; }
          public VariableArray<VariableArray<double>, double[][]> TruePackingRate { get; set; }
          public VariableArray<double> PackingRateFractionalErrorPrecision { get; set; }
          public VariableArray<double> PackingRateFractionalErrorMean { get; set; }
          public VariableArray<VariableArray<double>, double[][]> PackingRateFractionalError { get; set; }
          #region Inference Engine and Posteriors
          public InferenceEngine InferenceEngine { get; set; }
          public Gaussian[] MarginalPackingRateFractionalErrorMean { get; set; }
          public Gamma[] MarginalPackingRateFractionalErrorPrecision { get; set; }
          public Gaussian[][] MarginalPackingRateFractionalError { get; set; }
          public Gaussian[] MarginalMeterErrorMean { get; set; }
          public Gamma[] MarginalMeterErrorPrecision { get; set; }
          public Gaussian[][] MarginalMeterError { get; set; }
          public Gaussian[] MarginalPackingRateAdditiveErrorMean { get; set; }
          public Gamma[] MarginalPackingRateAdditiveErrorPrecision { get; set; }
          public Gaussian[][] MarginalPackingRateAdditiveError { get; set; }
          public Gaussian[][] MarginalTrueFlow { get; set; }
          public Gaussian[][] MarginalTruePackingRate { get; set; }
          public SimpleVolumeBalanceSections()
             InferenceEngine = null;
             IsModelCreated = false;
             NumMeters = Variable.New<int>().Named("NumMeters");
             NumSnapshots = Variable.New<int>().Named("NumSamples");
             NumVbs = Variable.New<int>().Named("NumVbs");
             IMeter = new Range(NumMeters).Named("meter");
             ISnapshot = new Range(NumSnapshots).Named("snapshot");
             IVbs = new Range(NumVbs).Named("vbs");
             NumVbsInMeters = Variable.Array<int>(IVbs).Named("numVbsInMeters");
             NumVbsOutMeters = Variable.Array<int>(IVbs).Named("numVbsOutMeters");
             IVbsInMeter = new Range(NumVbsInMeters[IVbs]).Named("inMeter");
             VbsInMeters = Variable.Array(Variable.Array<int>(IVbsInMeter), IVbs).Named("vbsInMeters");
             IVbsOutMeter = new Range(NumVbsOutMeters[IVbs]).Named("outMeter");
             VbsOutMeters = Variable.Array(Variable.Array<int>(IVbsOutMeter), IVbs).Named("vbsOutMeters");
             // Fundamental / core variables
             TrueFlow = Variable.Array(Variable.Array<double>(ISnapshot), IMeter).Named("TrueFlow");
             MeterFlowSnapshots = Variable.Array(Variable.Array<double>(ISnapshot), IMeter).Named("MeterFlow");
             TruePackingRate = Variable.Array(Variable.Array<double>(ISnapshot), IVbs).Named("TruePackingRate");
             PackingRateSnapshots = Variable.Array(Variable.Array<double>(ISnapshot), IVbs).Named("PackingRate");
             TrueFlowBalance = Variable.Array(Variable.Array<double>(ISnapshot), IVbs).Named("TrueFlowBalance");
             TrueVolumeBalance = Variable.Array(Variable.Array<double>(ISnapshot), IVbs).Named("TrueVolumeBalance");
             // Variables that represent the measurement and computational errors
             MeterFlowError = Variable.Array(Variable.Array<double>(ISnapshot), IMeter).Named("MeterFlowError");
             MeterErrorMean = Variable.Array<double>(IMeter).Named("MeterErrorMean");
             MeterErrorPrecision = Variable.Array<double>(IMeter).Named("MeterErrorPrecision");
             PriorMeterErrorMean = Variable.Array<Gaussian>(IMeter).Named("PriorMeterErrorMean");
             PriorMeterErrorPrecision = Variable.Array<Gamma>(IMeter).Named("PriorMeterErrorPrecision");
             MeterErrorMean[IMeter] = Variable<double>.Random(PriorMeterErrorMean[IMeter]);
             MeterErrorPrecision[IMeter] = Variable<double>.Random(PriorMeterErrorPrecision[IMeter]);
             PackingRateAdditiveError = Variable.Array(Variable.Array<double>(ISnapshot), IVbs).Named("PackingRateAdditiveError");
             PackingRateAdditiveErrorMean = Variable.Array<double>(IVbs).Named("PackingRateAdditiveErrorMean");
             PackingRateAdditiveErrorPrecision = Variable.Array<double>(IVbs).Named("PackingRateAdditiveErrorPrecision");
             PriorPackingRateAdditiveErrorMean = Variable.Array<Gaussian>(IVbs).Named("PriorPackingRateAdditiveErrorMean");
             PriorPackingRateAdditiveErrorPrecision = Variable.Array<Gamma>(IVbs).Named("PriorPackingRateAdditiveErrorPrecision");
             PackingRateAdditiveErrorMean[IVbs] = Variable<double>.Random(PriorPackingRateAdditiveErrorMean[IVbs]);
             PackingRateAdditiveErrorPrecision[IVbs] = Variable<double>.Random(PriorPackingRateAdditiveErrorPrecision[IVbs]);
             PackingRateFractionalError = Variable.Array(Variable.Array<double>(ISnapshot), IVbs).Named("PackingRateFractionalError");
             PackingRateFractionalErrorMean = Variable.Array<double>(IVbs).Named("PackingRateFractionalErrorMean");
             PackingRateFractionalErrorPrecision = Variable.Array<double>(IVbs).Named("PackingRateFractionalErrorPrecision");
             PriorPackingRateFractionalErrorMean = Variable.Array<Gaussian>(IVbs).Named("PriorPackingRateFractionalErrorMean");
             PriorPackingRateFractionalErrorPrecision = Variable.Array<Gamma>(IVbs).Named("PriorPackingRateFractionalErrorPrecision");
             PackingRateFractionalErrorMean[IVbs] = Variable<double>.Random(PriorPackingRateFractionalErrorMean[IVbs]);
             PackingRateFractionalErrorPrecision[IVbs] = Variable<double>.Random(PriorPackingRateFractionalErrorPrecision[IVbs]);
          public void SetModelData(
             int numberOfBalanceSections,
             Gaussian[] priorMeterErrorMean,
             Gamma[] priorMeterErrorPrecision,
             Gaussian[] priorPackingRateAdditiveErrorMean,
             Gamma[] priorackingRateAdditiveErrorPrecision,
             Gaussian[] priorPackingRateFractionalErrorMean,
             Gamma[] prioPackingRateFractionalErrorPrecision,
             double[][] meterReadings,
             double[][] observedPackingRates, 
             int[][] flowsIn, 
             int[][] flowsOut)
             if (!IsModelCreated)
             ObserveData(meterReadings, observedPackingRates);
          private void SetPhysicalConfiguration(int numberOfBalanceSections,
             int numberOfMeters,
             int[][] vbsInMeters,
             int[][] vbsOutMeters)
             NumMeters.ObservedValue = numberOfMeters;
             NumVbs.ObservedValue = numberOfBalanceSections;
             NumVbsInMeters.ObservedValue = vbsInMeters.Select(r => r.Length).ToArray();
             NumVbsOutMeters.ObservedValue = vbsOutMeters.Select(r => r.Length).ToArray();
             VbsInMeters.ObservedValue = vbsInMeters;
             VbsOutMeters.ObservedValue = vbsOutMeters;
          private void CreateModel()
             MeterFlowError[IMeter][ISnapshot] = Variable.GaussianFromMeanAndPrecision(
                      MeterErrorMean[IMeter], MeterErrorPrecision[IMeter]).ForEach(ISnapshot);
             TrueFlow[IMeter][ISnapshot] = MeterFlowSnapshots[IMeter][ISnapshot] 
                - MeterFlowError[IMeter][ISnapshot];
             PackingRateAdditiveError[IVbs][ISnapshot] = Variable.GaussianFromMeanAndPrecision(
             PackingRateFractionalError[IVbs][ISnapshot] = Variable.GaussianFromMeanAndPrecision(
             var packingRateError1 = Variable.Array(Variable.Array<double>(ISnapshot), IVbs).Named("packingRateError1");
             packingRateError1[IVbs][ISnapshot] = (PackingRateSnapshots[IVbs][ISnapshot] * PackingRateFractionalError[IVbs][ISnapshot]);
             var packingRateError = Variable.Array(Variable.Array<double>(ISnapshot), IVbs).Named("packingRateError");
             packingRateError[IVbs][ISnapshot] = PackingRateAdditiveError[IVbs][ISnapshot] + packingRateError1[IVbs][ISnapshot];
             TruePackingRate[IVbs][ISnapshot] = PackingRateSnapshots[IVbs][ISnapshot]
                - packingRateError[IVbs][ISnapshot];
             Variable<double> Zero = Variable.New<double>().Named("Zero");
             Zero.ObservedValue = 0;
             var vbsTrueInFlows = Variable.Array(Variable.Array(Variable.Array<double>(IVbsInMeter), IVbs), ISnapshot).Named("vbsTrueInFlows");
             var vbsTrueOutFlows = Variable.Array(Variable.Array(Variable.Array<double>(IVbsOutMeter), IVbs), ISnapshot).Named("vbsTrueOutFlows");
             vbsTrueInFlows[ISnapshot][IVbs][IVbsInMeter] = TrueFlow[VbsInMeters[IVbs][IVbsInMeter]][ISnapshot];
             vbsTrueOutFlows[ISnapshot][IVbs][IVbsOutMeter] = TrueFlow[VbsOutMeters[IVbs][IVbsOutMeter]][ISnapshot];
                = Variable.Sum(vbsTrueInFlows[ISnapshot][IVbs]).Named("totalVbsTrueInFlow")
                - Variable.Sum(vbsTrueOutFlows[ISnapshot][IVbs]).Named("totalVbsTrueOutFlow");
             TrueVolumeBalance[IVbs][ISnapshot] = TrueFlowBalance[IVbs][ISnapshot] -TruePackingRate[IVbs][ISnapshot];
             Variable.ConstrainEqual(TrueVolumeBalance[IVbs][ISnapshot], Zero);
             IsModelCreated = true;
          public void InferPosteriors(InferenceEngine inferenceEngine)
             // Retrieve the posterior distributions for each meter and section:
             MarginalMeterErrorMean = inferenceEngine.Infer<Gaussian[]>(MeterErrorMean);
             MarginalMeterErrorPrecision = inferenceEngine.Infer<Gamma[]>(MeterErrorPrecision);
             MarginalPackingRateFractionalErrorMean = inferenceEngine.Infer<Gaussian[]>(PackingRateFractionalErrorMean);
             MarginalPackingRateFractionalErrorPrecision = inferenceEngine.Infer<Gamma[]>(PackingRateFractionalErrorPrecision);
             MarginalPackingRateAdditiveErrorMean = inferenceEngine.Infer<Gaussian[]>(PackingRateAdditiveErrorMean);
             MarginalPackingRateAdditiveErrorPrecision = inferenceEngine.Infer<Gamma[]>(PackingRateAdditiveErrorPrecision);
             MarginalTrueFlow = inferenceEngine.Infer<Gaussian[][]>(TrueFlow);
             MarginalMeterError = inferenceEngine.Infer<Gaussian[][]>(MeterFlowError);
             MarginalTruePackingRate = inferenceEngine.Infer<Gaussian[][]>(TruePackingRate);
             MarginalPackingRateAdditiveError = inferenceEngine.Infer<Gaussian[][]>(PackingRateAdditiveError);
             MarginalPackingRateFractionalError = inferenceEngine.Infer<Gaussian[][]>(PackingRateFractionalError);
          public void InferPosteriors()
             if (InferenceEngine == null)
                throw new ApplicationException("SimpleVolumeBalance.InferenceEngine must be specified before calling InferPosteriors()");
          private void SetPriors(Gaussian[] priorMeterErrorMean,
             Gamma[] priorMeterErrorPrecision,
             Gaussian[] priorPackingRateAdditiveErrorMean,
             Gamma[] priorPackingRateAdditiveErrorPrecision,
             Gaussian[] priorPackingRateFractionalErrorMean,
             Gamma[] priorPackingRateFractionalErrorPrecision
             // Set Priors
             PriorMeterErrorMean.ObservedValue = priorMeterErrorMean;
             PriorMeterErrorPrecision.ObservedValue = priorMeterErrorPrecision;
             PriorPackingRateAdditiveErrorMean.ObservedValue = priorPackingRateAdditiveErrorMean;
             PriorPackingRateAdditiveErrorPrecision.ObservedValue = priorPackingRateAdditiveErrorPrecision;
             PriorPackingRateFractionalErrorMean.ObservedValue = priorPackingRateFractionalErrorMean;
             PriorPackingRateFractionalErrorPrecision.ObservedValue = priorPackingRateFractionalErrorPrecision;


    Monday, May 14, 2012 9:42 PM

All replies

  • Yes, there is a better way to impose the constraint.  You can write the model instead as a generative model, where you are generating the observations from the latent TrueFlows:

    TrueFlow[IMeter][ISnapshot] = Variable.GaussianFromMeanAndPrecision(...); MeterFlowSnapshots[IMeter][ISnapshot] = Variable.GaussianFromMeanAndPrecision(TrueFlow[IMeter][ISnapshot] +
    MeterErrorMean[IMeter], MeterErrorPrecision[IMeter]); TruePackingRate[IVbs][ISnapshot] = Variable.Copy(TrueFlowBalance[IVbs][ISnapshot]); PackingRateSnapshots[IVbs][ISnapshot] = Variable.GaussianFromMeanAndPrecision(
    TruePackingRate[IVbs][ISnapshot] * PackingRateFractionalError[IVbs][ISnapshot] +
    PackingRateAdditiveErrorMean[IVbs], PackingRateAdditiveErrorPrecision[IVbs]);

    Not only does this make inference easier, but I think it is a more well-defined model, since it makes your prior knowledge about the TrueFlows explicit.  This formulation should work equally well with EP, VMP, and Gibbs Sampling.

    Your second question was about recovering the error variances.  This may be because of the symmetry of the model.  If you are using a symmetric prior and symmetric initialisation, then I don't think it is possible to determine the source of the error.

    Wednesday, May 16, 2012 1:34 PM
  • Tom,

    Thank you for your response.  I've been trying variations on your suggestions but have not resolved the model to my satisfaction. I am still puzzling over the best way to deal with it. 

    1) I liked the way you avoided the Constrain.Equal but using the Variable.Copy statement.

    2) I'm not comfortable with the statement "it makes your prior knowledge about the TrueFlows explicit".  I really don't have any prior knowledge about the true flows other than that any value between 0 and 25,000 is reasonable.  Instead, it seems to me that I have much more prior knowledge about the flowMeterError.

    3) I remain puzzled about why I can't recover the error variances.  For example, take the simple case of 5 meters connected end to end in a linear train (ignoring packing rate).  If one meter is noisy and the others are not, I know that I can pick out the bad actor by inspection of the data.  For example, if I number my meters as 1 - 5 and let meter 3 be the noisy meter, the imbalance between meters 1 & 2, 1 & 4, 1 & 5, 2 & 4, 2 & 5, and 4 & 5 are small but those between 1 & 3, 2 & 3, 3 & 4, and 3 & 5 are large.  Since balances with meter 3 are all equally noisy but all other balances are all equally "quiet", I know that meter 3 is the noisy meter.  I thought that this would somehow fall out implicitly from the model.  I presume that I could accomplish my goal by creating a series of models that express the balances between every two sets of meters and somehow then tie together the results.  But, I was expecting that the inferencing engine would figure this out for me. 

    For example, I offer the following argument:

    If there are 100 meters and only one has a large error E, the other 99 meters will have to be adjusted by the some amount, say "x" and the one with the error must be adjusted by E-x for all flow to balance.  Since there are 99 meters being adjusted by x and only one by E-x, on every snapshot of data, |x| < |E - x|.  Another way of saying it is that |x| = |E - x| is less probable than |x| < |E - x|.  Then, if we have thousands of snapshots for which this is true, over time, the variance of the error of the one problem meter should increase relative to the other meters.

    I look forward to your comments. Am I missing something fundamental here?


    Wednesday, June 6, 2012 4:09 AM
  • 2) Prior knowledge can be vague or precise.  Even if your prior knowledge is vague, you can still be explicit about it.

    3) I agree that if the packing rates are noise-free then you should be able to work out the noise on the meters.  Are you setting the prior so that packing rates are known to have less noise than meters?

    Wednesday, June 6, 2012 12:58 PM
  • Yes, I have tried taking packing rates out of the model altogether.  I'm trying some variations on your ideas.  I wonder if it makes a difference whether I train the network a snapshot at a time, infer the marginals and then feed them back into the priors rather than providing a training set of a 1000 snapshots.

    A collegue of mine has been able to get a variation of your proposed approach working. He found that the Gibbs Sampling gave results that he expected but expectation propagation did not.  Does that make sense to you?



    Friday, June 8, 2012 3:38 AM
  • I don't know what you mean by "taking packing rates out of the model," but that doesn't sound the same as setting their noise level to zero.  As for Gibbs Sampling versus EP, this can always happen due to the different approximations the algorithms are making.  In this model, if EP has a problem it is probably due to using precision priors that are too vague.  I can't tell from the code above what priors you are using.
    Friday, June 8, 2012 10:42 AM