[1] Elmagarmid A K, Ipeirotis P G, Verykios V S. Duplicate record detection: a survey [J]. IEEE Transactions on Knowledge and Data Engineering, 2007, 19(1): 1-16.
[2] Christen P. Data matching: concepts and techniques for record linkage, entity resolution, and duplicate detection [M]. Berlin: Springer Science and Business Media, 2012.
[3] Christen P. Automatic record linkage using seeded nearest neighbour and support vector machine classification [C]//Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2008: 151-159.
[4] Bhattacharya I, Getoor L. Collective entity resolution in relational data [J]. ACM Transactions on Knowledge Discovery from Data (TKDD), 2007, 1(1): 5.
[5] Li P, Dong X, Maurino A, et al. Linking temporal records [J]. Proceedings of the VLDB Endowment, 2011, 4(11): 956-967.
[6] Cohen W W. Integration of heterogeneous databases without common domains using queries based on textual similarity [C]//ACM SIGMOD Record. 1998: 201-212.
[7] Vandic D, Van Dam J W, Frasincar F. Faceted product search powered by the Semantic Web [J]. Decision Support Systems, 2012, 53(3): 425-437.
[8] Dunn H L. Record linkage [J]. American Journal of Public Health and the Nations Health, 1946, 36(12): 1412-1416.
[9] Newcombe H B, Kennedy J M, Axford S J, et al. Automatic linkage of vital records: computers can be used to extract “follow-up” statistics of families from files of routine records [J]. Science, 1959, 130(3381): 954-959.
[10] Fellegi I P, Sunter A B. A theory for record linkage [J]. Journal of the American Statistical Association, 1969, 64(328): 1183-1210.
[11] Ukkonen E. Approximate string-matching with q-grams and maximal matches [J]. Theoretical Computer Science, 1992, 92(1): 191-211.
[12] Broder A Z, Charikar M, Frieze A M, et al. Min-wise independent permutations [J]. Journal of Computer and System Sciences, 2000, 60(3): 630-659.
[13] McCallum A, Nigam K, Ungar L H. Efficient clustering of high-dimensional data sets with application to reference matching [C]//Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2000: 169-178.