Peacock Data talks about fuzzy logic
January 21, 2014 (PRLEAP.COM) Technology News - Barbara Adair from Peacock Data sat down for an interview earlier this month to discuss the fuzzy logic technology integrated in their new pdNickname 2.0 Pro and pdGender 2.0 Pro products. She is the company's chief development coordinator and has been with the firm since 2012.
pdNickname is an advanced name and nickname database. pdGender is a gender coding file built on the same set of names. The Pro editions of these products provide fuzzy logic for the first time, allowing users to match information against their lists even when there are typographical errors or stylized spellings.
According to Barbara, "Previous products, such as our pdGeoTIGER geo-coder, included fuzzy logic, but our newest technology is the most highly developed to date."
"There are two kinds of fuzzy logic incorporated," she said. "One is designed to pick up common typographical errors and the other works with stylized spellings and letters not on a regular English keyboard."
Barbara pointed out, "The most complex fuzzy logic involves predicting likely misspellings or alterations. We look at numerous factors that may occur in the spelling of a name. Common examples are frequently reversed digraphs (a pair of letters used to make one phoneme or distinct sound), double letters that are often typed as single letters, non-common characters, the number of letters in a name, where elements occur in a name, and hundreds of other possible factors."
"A lot of research and field trials have gone into creating the fuzzy logic algorithms and their inclusion in our new products will substantially increase their power for users," she added.
Barbara illustrated several examples from the actual databases used in the products. The first was "Sophia" which has the digraph "ph" in the center. One of the most common typos is "Sohpia" with the digraph reversed as "hp". The new databases pick up both spellings. Another example is "Rocco" typed as "Roco" with one "c".
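The two typographical examples above can be sketched in a few lines of Python. This is only an illustration of the general idea, not Peacock Data's algorithm: the function names, the short digraph list, and the lookup index are all assumptions made for the example, and the actual products are described as weighing hundreds of additional factors.

```python
# Hypothetical digraph list for illustration only; the rules used by the
# actual products are not described in this release.
DIGRAPHS = ("ph", "th", "ch", "sh")


def typo_variants(name):
    """Return likely misspellings of a name: reversed digraphs and
    double letters typed as single letters."""
    variants = set()
    lower = name.lower()
    for i in range(len(name) - 1):
        pair = lower[i:i + 2]
        if pair in DIGRAPHS:
            # Reverse the digraph, e.g. "ph" becomes "hp".
            variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
        if pair[0] == pair[1]:
            # Drop one of a doubled letter, e.g. "cc" becomes "c".
            variants.add(name[:i] + name[i + 1:])
    return variants


def build_index(names):
    """Map each real name and each predicted misspelling (lowercased)
    back to the real name."""
    index = {}
    for name in names:
        index[name.lower()] = name
        for variant in typo_variants(name):
            # Real names take priority over predicted misspellings.
            index.setdefault(variant.lower(), name)
    return index


index = build_index(["Sophia", "Rocco"])
print(index.get("sohpia"))  # Sophia
print(index.get("roco"))    # Rocco
```

Generating the variants ahead of time and storing them alongside the real names means a lookup stays a simple exact match, which is one plausible reason a database product would ship fuzzy spellings as data rather than compute them at query time.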
Not all the examples involve typographical errors. Some concern stylized spellings and special letters such as the umlaut in the middle of "Björk". In this case the products will find the name both with the umlaut and when typed as "Bjork" without the special character.
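Matching a name with or without its special characters can be approximated with Python's standard unicodedata module. The sketch below is an assumption about one common way to do this, not a description of how pdNickname or pdGender work internally, and the function name fold_diacritics is made up for the example.

```python
import unicodedata


def fold_diacritics(name):
    """Return the name with accents and umlauts removed."""
    # NFD splits an accented letter into its base letter plus a
    # separate combining mark, e.g. "ö" into "o" and U+0308.
    decomposed = unicodedata.normalize("NFD", name)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_diacritics("Björk"))  # Bjork
```

Letters that have no base-plus-mark decomposition, such as "ø" or "ß", pass through unchanged and would need an explicit substitution table.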
"The difference between a real name and a fuzzy version can be very slight and even difficult to notice at first glance," Barbara said. "But they are different and can make a big difference in the success rate for businesses and organizations working with lists of names."
Barbara noted, "A sizable majority of the Pro edition of both new products are built with fuzzy logic, but users not ready to dive into the new technology can purchase a Standard edition without fuzzy logic and easily add it later when they are ready."
Starting in March, the company will release fuzzy logic add-on packs every month, providing even greater capabilities. Users will be able to choose among the packs, which are compatible with both the Pro and Standard versions of pdNickname 2.0 and pdGender 2.0.
"The new fuzzy logic technology will also be integrated in other products already in our line as well as in new products currently in development," Barbara concluded. "It is very exciting for us and our end users."
About Peacock Data
Peacock Data is the maker of database products used by businesses, organizations, churches, schools, researchers, and government.
Its flagship offerings include: pdNickname, a highly regarded name and nickname product recently upgraded to version 2.0; pdGender, a gender coding database also recently upgraded to version 2.0; pdGeoTIGER, a precision ZIP+4 and address range geocoding package; pdCensus2010, with demographic data drawn from 2010 American census tabulations; and pdACS2013, unveiled last May, another demographics offering providing American Community Survey (ACS) estimates gathered from the U.S. Census Bureau and summarized at over 100 stratification levels.
Peacock Data is a California-based company in business since 2003.
PRODUCT RELEASE SCHEDULE:
March: first fuzzy logic add-on packs for pdNickname and pdGender (to be released monthly).
April: pdGeoTIGER 2.0, pdCensus2010 1.0.
May: pdLatino 1.0, pdACS2014 1.0.
June: pdZIP 2.0, pdZIP+4 1.0.
July: pdCensus2000 2.0.
August: pdSurname 1.0.
September: pdProperCase 1.0.
October: pdCountry 2.0.
November: pdGeoTIGER 2.1, pdCensus2010 1.1.