locked
Change Dictionary values while iterating RRS feed

  • Question

  • I am looking for an efficient solution to the following problem: I have a dictionary with a significant number of items in it. I now need to iterate through all items, look at each value and based on a condition perform a transformation on it. As a result some item values are updated. Here is some pseudo-code (the actual objects in the dictionary are not strings, but that does not matter here):

                foreach (KeyValuePair<string, string> item in dictionary)
                {
                    if (condition(item.Value))
                        item.Value = transform(item.Value);
                }

    Obviously, this does not work, as the value property is read-only (for no obvious reason though...I can see why Key is readonly, but not value).

    What is an efficient approach to doing this?

    Note, that I would not want to copy all values into a new dictionary, e.g.

                foreach (KeyValuePair<string, string> item in dictionary)
                {
                    if (condition(item.Value))
                        newDict[item.Key] = transform(item.Value);
                    else
                        newDict[item.Key] = item.Value;
                }

    as this is allocating another dictionary and performs a large number of unneccessairy operations if only few items actually need to be transformed.
    I was considering something like

                ICollection<string> keys = new List<string>(dictionary.Keys);
                foreach (string k in keys)
                {
                    String val = dictionary[k];
                    if (condition(val))
                        dictionary[k] = transform(val);
                }


    i.e. copy all keys and iterate over them instead of the actual items. This seems to be inefficient as well, as it does a large number of lookups with the [] operator.

    Any ideas?

    Wednesday, April 23, 2008 12:04 PM

Answers

  • You can do this by having the dictionary store references instead of values.  Here's an example:

        private class DictValue {
          public DictValue(int v) { iValue = v; }
          public int iValue;
        }

        private Dictionary<string, DictValue> mDict = new Dictionary<string, DictValue>();

        private void Test() {
          mDict.Add("zero", new DictValue(0));
          mDict.Add("one", new DictValue(1));
          foreach (KeyValuePair<string, DictValue> v in mDict)
            v.Value.iValue += 1;
          foreach (KeyValuePair<string, DictValue> v in mDict)
            Console.WriteLine("{0} = {1}", v.Key, v.Value.iValue);
        }

    Output:
    zero = 1
    one = 2
    Friday, April 25, 2008 12:44 PM

All replies

  • Because condition is a function operating on the value instead of on the key, you have no other option than to iterate over all elements in the dictionary.

    I don't know your design of course but if you had a dictionary that had the result of the condition on the value as key, and you were to use a container that supports multiple indexes, then you could achieve greater efficience.

     

    If you stick to the current container, you have no option other than iterating the container. Taking steps like creating a temporary string list will not help.

    Wednesday, April 23, 2008 12:16 PM
  • To further illustrate the case, lets assume that the map holds element IDs (e.g. for html divs) as keys and colors as values. Colors can either be #nnnnnn or "red", "green", etc.
    The algorithm is supposed to replace non-#nnnnnn colors with their corresponding color code, i.e. replace "red" with #FF0000.
    (Note that his is an analogy to the actual problem, which involves much more complicated data structures).

    I am not worried about iterating all elements as I know I have to inspect every value. I am worried that I have to copy the entire dictionary with perhaps 100 entries, just because 1 or 2 values need to be replaced.
    Wednesday, April 23, 2008 12:30 PM
  • But you can change values.

    Like this:

     

    Code Snippet

    Dictionary<string,string> dic = new Dictionary<string,string>();

    dic.Add("1","11");

    dic["1"] = "3";

     

     

    You stated that values could not be changed, but they can be changed.
    Wednesday, April 23, 2008 12:55 PM
  •  

    If your key is small then I would consider storing it twize, and then when you add it to the dictionary then also adding it to a list.

     

    Then you can walkthrough the list, and change dictionary based on the list

     

    List<string> list = new List<string>;

     

    Foreach(string key in list)

    {

           dictionary[key].value = transform();

    }

     

    You can try it out and see if it is faster.

    Wednesday, April 23, 2008 1:07 PM
  • @Philip: But you cannot change values while iterating in a foreach loop. Doing so invalidates the iterator - although, as stated, there does not seem to be an obvious reason for this.
    Wednesday, April 23, 2008 1:12 PM
  • This is interesting problem. But your problem have two part. First one is Dictionary, but the more important is foreach. If you want to alter collection in a foreach loop, then this loop type is not the best one. For example you can't remove items while looping through collection with foreach. For example you can't use this code just because you can't edit collection in foreach loop:

    foreach(string key in dictionary.Keys)

    {

        if (condition(dictionary[key])

             dictionary[key] = transform(dictionary[key]);

    }

     

    If is posible to loop trough your dictionary using for loop than you can change values. In your case key is string type, so if you can evaluate key value in for loop you will be fine, otherwise you have to change to some other type of collection.

    Wednesday, April 23, 2008 1:21 PM
  •  boban.s wrote:

    This is interesting problem. But your problem have two part. First one is Dictionary, but the more important is foreach. If you want to alter collection in a foreach loop, then this loop type is not the best one. For example you can't remove items while looping through collection with foreach. For example you can't use this code just because you can't edit collection in foreach loop:

    foreach(string key in dictionary.Keys)

    {

        if (condition(dictionary[key])

             dictionary[key] = transform(dictionary[key]);

    }

     

    If is posible to loop trough your dictionary using for loop than you can change values. In your case key is string type, so if you can evaluate key value in for loop you will be fine, otherwise you have to change to some other type of collection.

     

    Exactly.  Never, never alter the items during a foreach loop.  Use the data, but do not alter it.  Use the Count property to write a for loop instead.

     

    Rudedog

    Wednesday, April 23, 2008 1:31 PM
  • In other words, I will resort to my 3rd solution from the initial post:

    ICollection<string> keys = new List<string>(dictionary.Keys);
    foreach (string k in keys)
    {
    String val = dictionary[k];
         if (condition(val))
         dictionary[k] = transform(val);
    }

    While I can understand that collections need to remain unchanged while iterating, for a dictionary changing the value, not the key should really not invalidate the iterator.
    Wednesday, April 23, 2008 1:41 PM
  • Don't do:

     J.Vollmering wrote:
    In other words, I will resort to my 3rd solution from the initial post:

    ICollection keys = new List(dictionary.Keys);
    foreach (string k in keys)
    {
    String val = dictionary[k];
         if (condition(val))
         dictionary[k] = transform(val);
    }


     

    But do:

     

    Code Snippet
    foreach (string k in dictionary.Keys)
    {
    String val = dictionary[k];
         if (condition(val))
         dictionary[k] = transform(val);
    }

     

     

    It should work.
    A foreach is generic in its approach that the element that you iterate upon cannot be altered. But for a dictionary this limitation is pointless, unless you add and or delete, but I am talking about altering only.
    As long as you don't change the key, the actual iterator is not invalidated.
     
    Wednesday, April 23, 2008 2:00 PM
  • It does not work, throws an exception "Collection was modified; enumeration operation may not execute.". Only copying the keys into a separate list seems to work ok. This is what seems wrong to me.
    Wednesday, April 23, 2008 2:17 PM
  •  J.Vollmering wrote:
    It does not work, throws an exception "Collection was modified; enumeration operation may not execute.". Only copying the keys into a separate list seems to work ok. This is what seems wrong to me.

    It cannot permit the collection to be modified in any way.  There's just no way around that limitation.  Just create a for loop using the Count property of the dictionary.

     

    for( i = 0; i < dictionary.Count; i++)

     

    Rudedog

     

    Ooops,  Use this collection.  Forgot to add.

     

    ublic class NameValueCollection : System.Collections.Specialized.NameObjectCollectionBase

    Member of System.Collections.Specialized

    Summary:

    Represents a collection of associated System.String keys and System.String values that can be accessed either with the key or with the index.

    Wednesday, April 23, 2008 2:43 PM
  • Yes, you are right.

    I just spent the last hour trying for a workaround.

    I even tried this:

     

    Code Snippet

    SortedList<string, string> dic = new SortedList<string, string>();

    dic.Add("1", "11");

    dic.Add("2", "11");

    dic.Add("3", "11");

    for (int i = 0; i < dic.Count; i++)

    {

    dic.Values[i] = "DD";

    }

     

     

    But also then I am getting an exception. This is in my opinion totally illogical because by doing these modifications you are not changing the internal hash, because that should be based on the key alone and not the value.

     

    So I agree with you, there is no other way than to copy.

    To my suprise I must admit, I guess we both are.

    I thought only the keys would have been protected from changes.

    My reasoning was that if you do a foreach on a dictionary you get both the key and value, so you would not be able to change any of them because there is a protection on the enumerator.

    So then I thought iterating on the keys only would help. It seems the framework is trying to be a little smarter than it should in this case because conceptually you are not doing anything wrong.

     

    I'd really like one of the Microsoft guys to provide an explanation here, I hope one of them picks this up.

    Wednesday, April 23, 2008 2:50 PM
  • If you allow modifications, what modifications do you allow, and how do you detect them?  I do believe that if you modify the collection, then the memory pointers to the actual content changes.

    Wednesday, April 23, 2008 2:53 PM
  •  Rudedog2 wrote:

    If you allow modifications, what modifications do you allow, and how do you detect them?

     

    When using a dictionary the only thing that accounts for the internal ordering mechanism is the key.

    So as long as the key is not altered conceptually there is nothing wrong in changing the value.

    Apparently this does not work here, and I guess I'll have to accept this, but I don't agree. I wouldn't go as far as to call it a bug though.

     

    Fortunately the main use of a dictionary is for lookup purposes. I never used it to change the values inside it via iteration before.

     

    The poster does have a point that he is going to deal with a small performance penalty by copying the keys into another collection.

    Wednesday, April 23, 2008 3:03 PM
  • If the collection gets modified it renders the enumerator object meaningless.  That object is what points to the data in memory. 

     

    public interface IEnumerator<T>

    Member of System.Collections.Generic

    Summary:

    Supports a simple iteration over a generic collection.

     

     

    Rudedog

     

    EDIT

    My answer is too cryptic.  Take a look at this link.

    http://msdn2.microsoft.com/en-us/library/system.collections.ienumerable.getenumerator.aspx

    Wednesday, April 23, 2008 3:09 PM
  • you are referring to this sentance:

     

    An enumerator remains valid as long as the collection remains unchanged. If changes are made to the collection, such as adding, modifying, or deleting elements, the enumerator is irrecoverably invalidated and its behavior is undefined.

     

    I understand this.

     

    Butttt, my point is, that in reality no matter how microsoft is doing things internally, I am 100% sure that actually by changing the value of a key-value combination in a dictionary, there is no reason to invalidate an enumerator.

     

    If there is such a limitation, and clearly there is, I am not agueing about it, then it is there because of how MS has implemented it.

     

    A dictionary is based on hashing, a hash is calculated on a key, the key is an index into an internal array. The enumerator is really enumerating through the array.

     

    So I am still curious why MS has foreseen this limitation.

    Wednesday, April 23, 2008 3:59 PM
  •  Rudedog2 wrote:

    If you allow modifications, what modifications do you allow, and how do you detect them? 

    I

     

    So what do you allow without slowing down the runtime?

    Wednesday, April 23, 2008 4:05 PM
  • When you iterate over the dictionary using foreach, the enumeration of KeyValuePair is generated on the fly - there is no list of KeyValuePair stored in the dictionary.

     

    So even if KeyValuePair weren't readonly, it wouldn't do you any good, as you wouldn't be updating the dictionary.

     

    Regarding being able to modify values while enumerating the keys, I tend to agree that it ought to be harmless.  However the code that updates a dictionary value increments a version field, which causes the subsequent InvalidOperationException from the enumerator.  I reckon this increment is probably unnecessary - you could try submitting a bug on Connect to see if Microsoft agree.

     

    Code Snippet

    private void Insert(TKey key, TValue value, bool add)

    {

        ...

        for (int i = this.buckets[index]; i >= 0; i = this.entries[i].next)

        {

            if ((this.entries[i].hashCode == num) && this.comparer.Equals(this.entries[i].key, key))

            {

                if (add)

                {

                    ThrowHelper.ThrowArgumentException(ExceptionResource.Argument_AddingDuplicate);

                }

                this.entries[i].value = value;

                this.version++;

                return;

            }

        }

    }

     

     

    So your original solution looks like the best one.  An alternative if you expect relatively few entries to match the condition for transforming would be to save the items you want to update in a temporary collection, e.g.:

     

    Code Snippet

    List<KeyValuePair<string, string>> itemsToTransform = new List<KeyValuePair<string, string>>();

    foreach (KeyValuePair<string, string> item in dictionary)

    {

        if (condition(item.Value)) itemsToTransform.Add(item);

    }

    foreach (KeyValuePair<string, string> item in itemsToTransform)

    {

        dictionary[item.Key] = item.Value.ToUpper();

    }

     

     

     

    Wednesday, April 23, 2008 5:51 PM
  • JocularJoe,

    Take a look at the NameValueCollection in the System.Collections.Specialized namespace....

     

    public class NameValueCollection : System.Collections.Specialized.NameObjectCollectionBase

    Member of System.Collections.Specialized

    Summary:

    Represents a collection of associated System.String keys and System.String values that can be accessed either with the key or with the index.

     

    ...when you get a chance.  Instead of doing this,

     

    Code Snippet

    List<KeyValuePair<string, string>>

     

     

    you have a ready made collection.

     

    Rudeodg

    Thursday, April 24, 2008 10:43 PM
  • You can do this by having the dictionary store references instead of values.  Here's an example:

        private class DictValue {
          public DictValue(int v) { iValue = v; }
          public int iValue;
        }

        private Dictionary<string, DictValue> mDict = new Dictionary<string, DictValue>();

        private void Test() {
          mDict.Add("zero", new DictValue(0));
          mDict.Add("one", new DictValue(1));
          foreach (KeyValuePair<string, DictValue> v in mDict)
            v.Value.iValue += 1;
          foreach (KeyValuePair<string, DictValue> v in mDict)
            Console.WriteLine("{0} = {1}", v.Key, v.Value.iValue);
        }

    Output:
    zero = 1
    one = 2
    Friday, April 25, 2008 12:44 PM
  • nobugz,

    That's a good concept and technique to point out. 

    I suggested the NameValueCollection because the OP wants to iterate over the loop, and modify some values while doing so.  But, you cannot modify a collection inside of a foreach block.  I think the "best" current solution has been to copy to an array and iterate over that using a for loop.

    Rudedog
    Friday, April 25, 2008 1:35 PM
  • The OP barked at the notion of having to allocate an array.  I agree.  The sample code I posted doesn't change the collection, doesn't invalidate the Dictionary and doesn't require extra memory beyond the 4 bytes object reference.  Let's leave it up to the OP to decide what's best.  Plenty to choose from.
    • Proposed as answer by Peter Makayed Friday, May 18, 2018 1:29 PM
    Friday, April 25, 2008 3:31 PM
  • I guess I have a simpler answer for this one. 

    var dictKeys = dict.Keys.ToArray();
    
    foreach(var key in dictKeys)
    {
       dict[key] = newValue;
    }
    Hope this helps.

    Monday, May 30, 2016 9:51 PM
  • https://stackoverflow.com/a/1167089/7149454
    Friday, November 30, 2018 10:04 AM