Hi,
If I understand your scenario correctly, I don't think you need MPI for it. Please let me know if I'm wrong.
You basically have a dataset of 100K records, and you need machine 1 to process the first 50K records and machine 2 to process the second 50K? I assume these 100K records are sitting in a database somewhere. If so, each machine should be able to independently
query the database to grab the records it needs.
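
To make that concrete, here is a rough Python sketch of what I mean. Everything in it is an assumption on my part, since I don't know your actual setup: the table name `records`, its `id` and `payload` columns, and the helpers `partition_bounds`, `fetch_partition`, and `make_demo_db` are all made up for illustration, and an in-memory SQLite database stands in for your real one. The idea is only that each machine is told its own index and the total number of machines, and from that it works out which slice of records to query, with no communication between machines.

```python
import sqlite3


def partition_bounds(total, workers, index):
    """Return the half-open range [start, end) of row positions for one worker.

    Rows are split as evenly as possible; the first (total % workers)
    workers each receive one extra row.
    """
    base, extra = divmod(total, workers)
    start = index * base + min(index, extra)
    end = start + base + (1 if index < extra else 0)
    return start, end


def fetch_partition(conn, total, workers, index):
    """Fetch only the rows that belong to worker `index` (0-based)."""
    start, end = partition_bounds(total, workers, index)
    # ORDER BY makes the slice deterministic, so workers never overlap.
    cur = conn.execute(
        "SELECT id, payload FROM records ORDER BY id LIMIT ? OFFSET ?",
        (end - start, start),
    )
    return cur.fetchall()


def make_demo_db(total):
    """Build a throwaway in-memory database (hypothetical schema) for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO records (id, payload) VALUES (?, ?)",
        ((i, f"row-{i}") for i in range(1, total + 1)),
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    TOTAL, WORKERS = 100_000, 2
    conn = make_demo_db(TOTAL)
    # In a real deployment each machine would run this with its own index.
    for index in range(WORKERS):
        rows = fetch_partition(conn, TOTAL, WORKERS, index)
        print(index, len(rows), rows[0][0], rows[-1][0])
```

In your case machine 1 would run with index 0 and machine 2 with index 1, each against the shared database. If the ids are contiguous, a `WHERE id BETWEEN ? AND ?` filter would probably be cheaper than `OFFSET` on a large table, but that depends on your schema.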