When you have a chore that takes a long time to complete, it's probably going to be faster if you can divide the work.

For example, if you paint a room on your own, the job might take a couple of hours. But with a couple of friends, the same amount of work can be completed in an hour or less, depending on the number of friends that are helping and how hard everyone is working. The same applies to programming, where these friends are called threads. When you want to work through a collection faster, a common solution is to divide the work among threads that run concurrently. In earlier versions of .NET, spawning new threads was manual work and required some knowledge. If you take a look at the thread docs, you can see that it takes some "orchestration code" to manage these threads.

Because we write that code on our own, there's also a probability that it contains bugs. It gets even more complex when you spawn multiple threads to achieve the best performance. Luckily, C# hides all of the implementation details for us with the Task Parallel Library (TPL) introduced in .NET. It's also safe to say that the chance of bugs within this code is far lower than with a custom implementation.

Data parallelism refers to scenarios in which the same operation is performed concurrently (that is, in parallel) on elements in a source collection or array. In data parallel operations, the source collection is partitioned so that multiple threads can operate on different segments concurrently.

In this article, we'll take a look at the different ways to process a collection faster.

Initial code

In our case, we had 60,000 items that had to be migrated from one system to another. It took 30 minutes to process 1,000 items, which makes it 30 hours in total to process all of the items in the collection. That's a lot of time, especially if you know that it isn't a difficult task to migrate an item. The simplified version of the initial code looked like this.

A simple iteration over a list, and within the loop, the migration of an item:

foreach (var itemId in itemsFromSystemA)

In our case, the "refactor" to the Parallel solution means that it now only takes 40 seconds to process 1,000 items.
For the whole collection of 60,000 items, it takes 40 minutes.
From 30 hours to 40 minutes with just a few lines of extra code!
Because we're using the number of processors of the machine, it takes 20% longer on my machine compared to the server.

But it doesn't stop here. While the Parallel solution works fine for my use case, there is also PLINQ. PLINQ brings parallelism to the well-known LINQ API. This ensures that the code remains readable while writing more complex business logic, where you need to order, filter, or transform the data. If you're already familiar with LINQ, I've got good news for you, because you'll immediately feel right at home.

A PLINQ query in many ways resembles a non-parallel LINQ to Objects query. PLINQ queries, just like sequential LINQ queries, operate on any in-memory IEnumerable or IEnumerable<T> data source, and have deferred execution, which means they do not begin executing until the query is enumerated. The primary difference is that PLINQ attempts to make full use of all the processors on the system. It does this by partitioning the data source into segments, and then executing the query on each segment on separate worker threads in parallel on multiple processors. In many cases, parallel execution means that the query runs significantly faster.

To process the collection in a parallel manner, we can use the AsParallel() extension method followed by any of the existing LINQ extension methods.
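To make the three approaches from this article concrete (a plain foreach, the Parallel class from the TPL, and PLINQ's AsParallel()), here is a minimal sketch that runs the same per-item work in each style. The names MigrationSketch and MigrateItem are hypothetical stand-ins, as is the Thread.Sleep that simulates slow work. The real migration code is not shown in this post, so treat this as an illustration under those assumptions rather than the code we actually ran.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class MigrationSketch
{
    // Hypothetical stand-in for migrating one item from system A to system B.
    // The sleep only simulates slow work; the real migration is not shown here.
    public static string MigrateItem(int itemId)
    {
        Thread.Sleep(1);
        return $"migrated-{itemId}";
    }

    // Baseline: one item at a time on the calling thread.
    public static List<string> MigrateSequential(IEnumerable<int> itemIds)
    {
        var results = new List<string>();
        foreach (var itemId in itemIds)
        {
            results.Add(MigrateItem(itemId));
        }
        return results;
    }

    // Parallel.ForEach partitions the source and runs the body on multiple threads.
    // Items finish in no particular order, so a thread-safe collection gathers them.
    public static List<string> MigrateWithParallelForEach(IEnumerable<int> itemIds)
    {
        var results = new ConcurrentBag<string>();
        // Assumption: cap the parallelism at the number of processors of the machine.
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };
        Parallel.ForEach(itemIds, options, itemId => results.Add(MigrateItem(itemId)));
        return results.ToList();
    }

    // PLINQ: AsParallel() followed by regular LINQ operators.
    // Without AsOrdered(), the output order is not guaranteed to match the input.
    public static List<string> MigrateWithPlinq(IEnumerable<int> itemIds)
    {
        return itemIds
            .AsParallel()
            .WithDegreeOfParallelism(Environment.ProcessorCount)
            .Select(itemId => MigrateItem(itemId))
            .ToList();
    }
}
```

Calling MigrationSketch.MigrateWithPlinq(itemIds) should return the same set of results as the sequential version, just not necessarily in the same order. How much faster it gets depends on the machine and on what the per-item work really does, so measure before settling on one approach.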