INVESTIGATION OF THE EFFICIENCY OF DISTRIBUTED INFORMATION
SYSTEMS BASED ON THE PROCESSING OF LARGE AMOUNTS OF DATA
This article investigates the features of processing large arrays of information in distributed systems. Singular value decomposition (SVD) of the data is used to reduce the amount of data processed by eliminating redundancy. Dependencies of computational efficiency were obtained for distributed systems built on the MPI message-passing interface and on the MapReduce software model of node interaction. The efficiency of each technology was analyzed for data sets of different sizes. Non-distributed systems are inefficient for large volumes of information because of their low computing performance. It is proposed to use distributed systems that apply singular value decomposition, which reduces the amount of information processed. For systems using MPI and MapReduce, the dependence of computation time on the number of processes was obtained; it confirms the expediency of distributed computing when processing large data sets. It was also found that distributed systems using the MapReduce model work much more efficiently than those using MPI on large amounts of data, whereas MPI performs calculations more efficiently on small amounts of information. As data sets grow, it is advisable to use the MapReduce model.
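The data-reduction step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the singular decomposition referred to is the standard singular value decomposition (SVD) and that NumPy is available; the function names `truncated_svd` and `stored_values`, the matrix size, and the chosen rank are hypothetical and are not taken from the article. The sketch shows only how keeping the leading singular components reduces the number of stored values; it is not the authors' implementation and does not involve MPI or MapReduce.

```python
import numpy as np


def truncated_svd(data: np.ndarray, rank: int):
    """Return the leading `rank` singular factors (U_k, s_k, Vt_k) of `data`."""
    # full_matrices=False gives the economy-size decomposition.
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    return u[:, :rank], s[:rank], vt[:rank, :]


def stored_values(u_k: np.ndarray, s_k: np.ndarray, vt_k: np.ndarray) -> int:
    """Count the numbers that must be kept to rebuild the approximation."""
    return u_k.size + s_k.size + vt_k.size


# Hypothetical test data: a 200 x 50 matrix that is exactly rank 3,
# i.e. highly redundant (assumption made for illustration only).
rng = np.random.default_rng(0)
data = rng.standard_normal((200, 3)) @ rng.standard_normal((3, 50))

u_k, s_k, vt_k = truncated_svd(data, rank=3)

# Rebuild the matrix from the reduced factors: scale the columns of U_k
# by the singular values, then multiply by Vt_k.
approx = (u_k * s_k) @ vt_k

original_count = data.size                      # 200 * 50 values
reduced_count = stored_values(u_k, s_k, vt_k)   # 200*3 + 3 + 3*50 values
```

For this redundant example the three retained components reproduce the matrix to within floating-point error while storing far fewer values. For real data the rank would have to be chosen from the decay of the singular values, and the reconstruction would be approximate.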