Parallel Generalized Hebbian Algorithm for Large Scale Data Analytics
Abstract
Big data systems such as Hadoop and parallel DBMSs require complex configuration and tuning in order to store and analyse large amounts of data on a parallel cluster. This is mostly a consequence of static partitioning, which occurs whenever data sets are loaded into the file system or imported into the DBMS. Parallel processing is then carried out in a distributed fashion, with the objective of achieving balanced execution among nodes. Such systems are notoriously difficult to configure, particularly with respect to node synchronisation, data redistribution, and distributed caching in main memory. The Generalized Hebbian Algorithm (GHA), also known in the literature as Sanger's rule, is a linear feedforward neural network model for unsupervised learning that is applied mainly to principal component analysis (PCA). It resembles Oja's rule in its formulation and stability, with the additional feature that it can be applied to networks with more than one output. This paper presents a hardware architecture for principal component analysis, with GHA chosen as the foundation of the design because it is both simple and efficient. The architecture comprises three parts: the memory unit, the weight vector updating unit, and the main computing unit. Within the weight vector updating unit, the computation of different synaptic weight vectors shares the same circuit in order to reduce area costs. The GHA architecture also incorporates a flexible MPI-based multi-computer framework, so GHA can be executed efficiently on both sequential and parallel platforms. We expect the architecture to benefit from parallel processing in the cloud when a data set is analysed for only a short period, or when the number of virtual processors is selected dynamically at runtime. Finally, this work presents a parallel implementation of several machine learning algorithms on top of the MapReduce paradigm, with the aim of improving processing speed and saving time.
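To make the GHA update concrete, the following is a minimal NumPy sketch of a single Sanger's-rule step; the function name, learning-rate value, and matrix shapes are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def gha_step(W, x, lr=1e-3):
    """One Generalized Hebbian Algorithm (Sanger's rule) update.

    W  -- (k, d) weight matrix; its rows converge to the top-k
          principal components of the (zero-mean) input distribution.
    x  -- (d,) input sample.
    lr -- learning rate (illustrative value).
    """
    y = W @ x  # outputs of the linear feedforward network
    # Sanger's rule: dW = lr * (y x^T - LT[y y^T] W), where LT keeps
    # only the lower triangle of y y^T.
    W += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W
```

The lower-triangular term is what distinguishes GHA from plain Hebbian learning: output i is effectively trained on the input with the contributions of outputs 1 through i-1 removed, which is why the rows of W converge to successive principal components rather than all collapsing onto the first one.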
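The abstract does not spell out how GHA is mapped onto MPI; the sketch below shows one plausible data-parallel scheme using mpi4py, in which each rank accumulates Sanger's-rule updates on its own data shard and the per-shard updates are summed with Allreduce. The shard sizes, hyperparameters, and minibatch-style averaging are all assumptions for illustration, not the paper's design:

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

d, k, lr, epochs = 64, 4, 1e-3, 20          # illustrative sizes
rng = np.random.default_rng(seed=rank)
shard = rng.standard_normal((1000, d))      # stand-in for this node's data
shard -= shard.mean(axis=0)                 # GHA assumes zero-mean inputs

# Start every rank from identical weights so averaged updates stay in sync.
W = np.zeros((k, d))
if rank == 0:
    W = 0.01 * rng.standard_normal((k, d))
comm.Bcast(W, root=0)

for _ in range(epochs):
    dW = np.zeros_like(W)
    for x in shard:                          # "map": local Sanger updates
        y = W @ x
        dW += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    total = np.zeros_like(dW)
    comm.Allreduce(dW, total, op=MPI.SUM)    # "reduce": sum across nodes
    W += total / size                        # average the shard updates

# Launch with e.g.: mpiexec -n 4 python gha_mpi.py
```

This pattern also mirrors the MapReduce formulation mentioned in the abstract: computing a per-shard update is the map step, and the global summation of updates is the reduce step.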
Article Details
This work is licensed under a Creative Commons Attribution 4.0 International License.