International Journal of Advances in Electrical Engineering
  • Printed Journal
  • Refereed Journal
  • Peer Reviewed Journal

P-ISSN: 2708-4574, E-ISSN: 2708-4582

2023, Vol. 4, Issue 1, Part A
De-duplication avoidance in regional names using an approach based on pronunciation

Author(s): Nagesh Raykar, Dr. Prashant Kumbharkar and Dr. Dand Hiren Jayatilal

Abstract: Duplication of demographic data arises in every field, including government, marketing, and opinion research, and is especially familiar to IT staff responsible for taking backups or transferring large amounts of data. Duplication occurs both directly and indirectly, for example when the same backup is copied more than once or when different users store the same file in the same location, and it increases redundancy. There is therefore an inherent need to process or remove redundant data. The term "de-duplication" refers to the removal of duplicate data: the duplicate copies are removed and only one copy is kept, which is required for better utilization of data storage. Many scholars have already worked on demographic data de-duplication, and one outstanding requirement is a specific reduction rule suited to a de-duplication algorithm for Indian demographic data. In this work, regional names (first name and last name) are evaluated on the basis of a pronunciation rule. This requires testing various phonetic-based algorithms and then developing an efficient new phonetic-based algorithm. A phonetic algorithm indexes words according to their pronunciation, and most phonetic algorithms have been designed primarily for the English language. Demographic information describes individuals through features such as first name, surname, age, gender, contact number, email ID, and so on. For names in Indian regional languages, the task is to identify an individual whose name is the same but is spelled differently. The proposed study compares traditional regional names in the format of first name and surname using the pronunciation rule. A prototype of an effective phonetic-based algorithm has been developed for the local languages. An effort has been made, first, to avoid redundant information in the names and, secondly, to identify equivalent names, even with different alphabetical arrangements, in order to locate an individual in the e-governance of a region or in any industry. The findings of the proposed approach are encouraging, and it can be used in a real-world environment.
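The idea of indexing names by pronunciation, as described in the abstract, can be illustrated with a minimal sketch. It uses classic American Soundex purely as a stand-in phonetic algorithm; it is not the algorithm proposed in the article, whose rules are not given in the abstract. The function names `soundex` and `name_key` and the sample names are illustrative assumptions, not taken from the paper's data.

```python
# Classic American Soundex, used here only as a stand-in phonetic algorithm.
# It is NOT the regional-language algorithm proposed in the article.
_GROUPS = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
           ("L", "4"), ("MN", "5"), ("R", "6"))
_CODES = {ch: digit for letters, digit in _GROUPS for ch in letters}


def soundex(name: str) -> str:
    """Return the 4-character Soundex code of a name ("" if it has no A-Z letters)."""
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    encoded = [letters[0]]                 # the first letter is kept as-is
    prev = _CODES.get(letters[0], "")      # code of the previously seen letter
    for ch in letters[1:]:
        if ch in "HW":
            continue                       # H and W do not separate equal codes
        code = _CODES.get(ch, "")          # vowels and Y have no code
        if code and code != prev:
            encoded.append(code)
            if len(encoded) == 4:
                break
        prev = code                        # a vowel resets prev, so repeats count again
    return "".join(encoded).ljust(4, "0")  # pad short codes with zeros


def name_key(first_name: str, surname: str) -> tuple:
    """Hypothetical blocking key: records with equal keys are duplicate candidates."""
    return soundex(first_name), soundex(surname)


# Illustrative spelling variants (assumed examples, not data from the article).
print(name_key("Prashant", "Deshpande"))    # ('P625', 'D215')
print(name_key("Prashanth", "Deshpandey"))  # ('P625', 'D215')
print(soundex("Vijay"), soundex("Bijay"))   # V200 B200
```

The first two records receive the same key although their spellings differ, which is the behaviour a pronunciation-based de-duplication step relies on. The last line shows two variants that Soundex keeps apart because it always preserves the first letter, one example of how an English-oriented algorithm can miss regional spelling variation.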

DOI: 10.22271/27084574.2023.v4.i1a.32

Pages: 10-17

How to cite this article:
Nagesh Raykar, Dr. Prashant Kumbharkar, Dr. Dand Hiren Jayatilal. De-duplication avoidance in regional names using an approach based on pronunciation. Int J Adv Electr Eng 2023;4(1):10-17. DOI: 10.22271/27084574.2023.v4.i1a.32