LinkageWiz Data Matching Software

LinkageWiz Specifications

Maximum number of records per table

Approximately 4-5 million records (Enterprise version required).

Files with more than 4-5 million records can be processed using a special Chunking methodology.

Maximum Database Size: 2 Gigabytes.

 


System Requirements

Minimum:

PC with 2 GB RAM, 5 GB free disk space.

Recommended:

PC with 4 GB RAM, 10 GB free disk space.

Operating System:

Windows 2000, XP, 2003 Server, Vista, 2008 Server 32 & 64 Bit, Windows 7, Windows 8, Windows 10, or a Mac running MS Windows

 


Data Requirements

One or more data files containing several patient identification fields. These can be configured in the Define Match Variables screen.

Supported file formats include text (fixed and delimited), MS Excel, MS Access, FoxPro and DBASE.


Languages

LinkageWiz is available in English and French language versions. Names from non-English-speaking countries (e.g. Eastern Europe, the Middle East, Asia)
will be processed correctly if they have been stored using the Roman character set (e.g. A-Z).


Performance & Linkage Quality

An assessment of LinkageWiz's Performance and Linkage Quality was recently undertaken by the Australian Centre for Data Linkage (CDL):

Type of Linkage | Speed: Runtime | Linkage Quality*: Precision | Recall | F-measure
--- | --- | --- | --- | ---
Deduplication of 40,000 records | 1 minute | 0.98 | 0.79 | 0.88
Deduplication of 400,000 records | 20 minutes | 0.95 | 0.76 | 0.85
Linkage of 400,000 records to 4 million records | 1.5 hours | 0.96 | 0.79 | 0.87

PC specifications: Intel Core i5 1.6 GHz processor, 4 GB RAM, Windows 7


* Precision, Recall and F-measure are measures used to assess linkage quality (Christen & Goiser, 2007):

  • Precision is the proportion of returned matches that are true matches (sometimes referred to as positive predictive value).

  • Recall is the proportion of all true matches that have been correctly identified (also known as sensitivity).

  • F-measure is high only when both precision and recall are high. There is an underlying trade-off between
    the two: settings that raise one typically lower the other. The F-measure is seen as a way of finding the best compromise
    between these two metrics.
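The three measures can be illustrated with a short calculation. This is a minimal sketch, assuming the balanced F-measure (the harmonic mean of precision and recall, F = 2PR / (P + R)), which is the usual form in the record linkage literature; the page does not state which variant the CDL used. The function names and the match counts below are hypothetical and chosen only for illustration.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Proportion of returned matches that are true matches."""
    returned = true_pos + false_pos
    return true_pos / returned if returned else 0.0


def recall(true_pos: int, false_neg: int) -> float:
    """Proportion of all true matches that were identified."""
    total_true = true_pos + false_neg
    return true_pos / total_true if total_true else 0.0


def f_measure(p: float, r: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall (assumed variant)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts, not taken from the CDL evaluation:
# 790 true matches found, 20 false matches returned, 210 true matches missed.
tp, fp, fn = 790, 20, 210
p = precision(tp, fp)
r = recall(tp, fn)
print(round(p, 2), round(r, 2), round(f_measure(p, r), 2))
```

Because the harmonic mean is pulled towards the lower of the two inputs, a high precision cannot compensate for a low recall. Recomputing the F-measure from the rounded precision and recall figures in the table may differ from the listed value in the last digit, presumably because the published figures were calculated from unrounded values.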


Comparison with Other Data Matching Software

The Australian Centre for Data Linkage (CDL), based at Curtin University, recently evaluated ten data matching software packages commonly
used in population health research with a view to assessing functionality, speed and linkage quality:

  • IBM QualityStage (Tier 1 product)

  • Dataflux dfPowerStudio (Tier 1 product)

  • LinkageWiz

  • FEBRL

  • HDI

  • The Link King

  • LINKS

  • Big Match

  • Program based on Scottish Linkage System

  • FRIL

 The evaluation found that LinkageWiz achieved a high matching accuracy and ranked 3rd overall:

General features – 4/5

Pre-processing capabilities – 4/5

Record linkage methodology – 4/5

Post-linkage functions – 4/5

A detailed evaluation/checklist of LinkageWiz's functionality, undertaken by the CDL during the evaluation, is also available.

Processing times may vary according to the data set characteristics, the number of linkage fields used and the computer specifications.