Big data environments must handle large volumes of highly diverse data. Big data systems use machine learning algorithms to process large datasets drawn from various sources, such as histories, weblogs, data repositories, and data warehouses. Most existing data mining approaches cannot handle datasets of this size. With conventional data mining, big data lacks compatibility with database systems and analysis tools, so clustering and analyzing large datasets remains a major problem. For this reason, this research work uses machine learning algorithms implemented in Hadoop to collect and process large amounts of structured, semi-structured, or unstructured data in a reasonable amount of time. The approach also yields a more accurate prediction system and more accurate information, and the use of machine learning algorithms reduces computational cost and complexity. The overall work is implemented in Hadoop using the Python programming language and is compared with some existing algorithms. The proposed work is evaluated using suitable measures, namely accuracy, Kappa T, and Kappa M.
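The three evaluation measures named above can be illustrated with a short Python sketch. The text does not define Kappa T or Kappa M, so the sketch assumes they refer to the Kappa Temporal and Kappa M statistics commonly used in data stream evaluation: Kappa M compares the classifier's accuracy with that of a majority-class baseline, and Kappa T compares it with a "no-change" baseline that predicts the previous true label. The function names, the batch (non-prequential) computation, and the convention of returning 0.0 when a baseline is already perfect are all illustrative assumptions, not the authors' implementation.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def _kappa(p0, p_base):
    """Generic kappa form: improvement over a baseline, normalized."""
    if p_base == 1.0:
        # Assumed convention: no improvement is possible over a perfect baseline.
        return 0.0
    return (p0 - p_base) / (1.0 - p_base)


def kappa_m(y_true, y_pred):
    """Kappa M (assumed definition): baseline always predicts the majority class.

    Simplification: the majority class is taken over the whole batch,
    not updated incrementally as in a prequential stream evaluation.
    """
    p0 = accuracy(y_true, y_pred)
    majority_count = Counter(y_true).most_common(1)[0][1]
    p_majority = majority_count / len(y_true)
    return _kappa(p0, p_majority)


def kappa_t(y_true, y_pred):
    """Kappa T (assumed definition): baseline predicts the previous true label.

    Both accuracies are computed on instances 2..n, since the first
    instance has no predecessor for the no-change baseline.
    """
    if len(y_true) < 2:
        raise ValueError("kappa_t needs at least two instances")
    targets = y_true[1:]
    p0 = accuracy(targets, y_pred[1:])
    p_no_change = accuracy(targets, y_true[:-1])
    return _kappa(p0, p_no_change)


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    y_true = [0, 0, 1, 1, 0, 0, 1, 1]
    y_pred = [0, 0, 1, 1, 0, 0, 1, 0]
    print("accuracy:", accuracy(y_true, y_pred))
    print("kappa M :", kappa_m(y_true, y_pred))
    print("kappa T :", kappa_t(y_true, y_pred))
```

Under these assumed definitions, a value near 1 means the classifier clearly outperforms the corresponding baseline. A value at or below 0 means it does no better than always predicting the majority class (Kappa M) or repeating the last observed label (Kappa T). Plain accuracy can hide both cases on imbalanced or temporally correlated data.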