Research Article

Data Duplication Tactics with Hadoop

by Waqas Ahmad, Hongwei Xie, Ammad Khan, Mubashir Tariq
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 177 - Number 15
Year of Publication: 2019

Waqas Ahmad, Hongwei Xie, Ammad Khan, Mubashir Tariq. Data Duplication Tactics with Hadoop. International Journal of Computer Applications. 177, 15 (Nov 2019), 6-12. DOI=10.5120/ijca2019919548

@article{ 10.5120/ijca2019919548,
author = { Waqas Ahmad and Hongwei Xie and Ammad Khan and Mubashir Tariq },
title = { Data Duplication Tactics with Hadoop },
journal = { International Journal of Computer Applications },
issue_date = { Nov 2019 },
volume = { 177 },
number = { 15 },
month = { Nov },
year = { 2019 },
issn = { 0975-8887 },
pages = { 6-12 },
numpages = {9},
url = { },
doi = { 10.5120/ijca2019919548 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:45:57.266764+05:30
%A Waqas Ahmad
%A Hongwei Xie
%A Ammad Khan
%A Mubashir Tariq
%T Data Duplication Tactics with Hadoop
%J International Journal of Computer Applications
%@ 0975-8887
%V 177
%N 15
%P 6-12
%D 2019
%I Foundation of Computer Science (FCS), NY, USA

The Hadoop Distributed File System (HDFS), a component of Apache Hadoop, provides distributed storage of very large data sets on a cluster of commodity hardware. HDFS ensures data availability by replicating data across multiple nodes. However, the HDFS replication policy does not take the popularity of data into account, and that popularity tends to change over time. Maintaining a fixed replication factor therefore reduces the storage efficiency of HDFS. In this paper we propose an efficient dynamic data replication management framework that considers the popularity of data stored in HDFS before replicating it. The approach dynamically classifies data as hot or cold according to its popularity, increases the number of replicas for hot data, and applies erasure coding to cold data. Experimental results show that the proposed approach reduces storage utilization by up to 40% without affecting availability or fault tolerance in HDFS.
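The hot/cold idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The access-count threshold, the `FileStat` record, the function names, and the Reed-Solomon layout of 6 data units plus 3 parity units are all assumptions made for illustration; only the replication factor of 3 reflects the HDFS default. The sketch labels each file from its recent access count and compares raw storage use against replicating every file three times.

```python
from dataclasses import dataclass

REPLICATION_FACTOR = 3   # HDFS default replication factor
EC_DATA_UNITS = 6        # assumed Reed-Solomon layout: 6 data units ...
EC_PARITY_UNITS = 3      # ... plus 3 parity units (1.5x raw storage)
HOT_THRESHOLD = 10       # hypothetical cutoff: accesses per observation window


@dataclass
class FileStat:
    path: str
    size_gb: float
    accesses: int  # access count in the most recent observation window


def classify(f: FileStat) -> str:
    """Label a file 'hot' or 'cold' from its recent access count."""
    return "hot" if f.accesses >= HOT_THRESHOLD else "cold"


def stored_size_gb(f: FileStat) -> float:
    """Raw storage used: full replicas for hot files, erasure coding for cold ones."""
    if classify(f) == "hot":
        return f.size_gb * REPLICATION_FACTOR
    return f.size_gb * (EC_DATA_UNITS + EC_PARITY_UNITS) / EC_DATA_UNITS


def savings_vs_uniform_replication(files) -> float:
    """Fraction of raw storage saved relative to replicating every file."""
    baseline = sum(f.size_gb * REPLICATION_FACTOR for f in files)
    tiered = sum(stored_size_gb(f) for f in files)
    return 1 - tiered / baseline


if __name__ == "__main__":
    # Made-up workload: one frequently read file and one rarely read archive.
    files = [
        FileStat("/logs/today", 25.0, 50),
        FileStat("/logs/archive", 75.0, 2),
    ]
    for f in files:
        print(f.path, classify(f), stored_size_gb(f))
    print(f"{savings_vs_uniform_replication(files):.1%}")
```

For this made-up mix the sketch reports a saving of 37.5%. The figure depends entirely on the assumed share of cold data and the assumed coding layout, so it should not be read as a reproduction of the paper's measurements.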

Index Terms

Computer Science
Information Sciences


Keywords

Big Data, Hadoop Distributed File System, Dynamic data replication