Log Cross Check Phase
Cross-log checking: Using the normalized logs, perform cross-log checking. For each QSO entry in a log, decide if it's confirmed, partially confirmed (for half credit), a NIL, a busted call, etc.
The determination of whether a Q is a DUPE happens during the Report Phase. Duplicate checking has nothing whatsoever to do with Log Cross-checking; the job of Log Cross Check is to determine the score (validity) of each individual QSO.
During the Reporting Phase, the highest scoring QSO between 2 stations for that Band/Mode is counted. It is completely possible for Station A to get full credit for a QSO with Station B even if B did not receive the correct information from A for that QSO. Each QSO is judged from the perspective of the receiving log.
If there are multiple QSO's between two stations on the same Band/Mode, the highest-scoring Q will "count" for the receiving station - that might be the 1st, 2nd, 3rd or even 4th QSO between the two stations. So if A and B work each other twice, and A copies B's information correctly on the first QSO but B botches the received info, A gets full credit and B gets less than that. If B contacts A again later in the contest and copies A's info correctly, B gets full credit for that QSO. QSO's are judged by halves: if you received what the other guy sent, you are "ok".
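A minimal sketch of that "judged by halves" idea follows. The field names and the partial-credit outcome are illustrative assumptions, not the actual Green scoring code:

```python
# Sketch of judging one QSO "by halves" from the receiving log's perspective.
# Field names and outcomes are illustrative; the real Green program applies
# the full CQP rules (partial credit for busted calls, NIL penalties, etc.).

def judge_received_half(received, sent_by_other):
    """received: what this log copied; sent_by_other: what the other log says it sent."""
    if received["call"] == sent_by_other["call"] and received["qth"] == sent_by_other["qth"]:
        return "full"      # copied everything the other station sent
    if received["qth"] == sent_by_other["qth"]:
        return "partial"   # e.g. busted callsign but correct QTH
    return "nil_or_bust"   # left for cross-checking / the human phase to decide
```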
Log Cross-checking has two major phases:
(1) automated computer cross checking and
(2) human assisted final checking.
The automated program can correctly score ~99.5% of the QSO's; the remaining ~0.5% require a human to make the final decision.
The humans use a GUI tool with capabilities to open multiple logs for comparison, and to sort, search and generate a reverse log.
The two main categories of problems that humans deal with are:
(1) NIL's and
(2) Band/Mode errors.
NIL is the ultimate penalty and although the program is almost always correct, human approval is required. CQP scoring rules allow partial credit for a busted callsign and sometimes a human can see a plausible way to give partial credit instead of a NIL. The number of NIL’s is very small.
The major problem is the Band/Mode error (there is a Q between two stations but the logged Band or Mode doesn't agree in both logs). Resolving this situation usually requires looking at multiple logs, not just the two logs in question. Even a top operator using a modern logging program with computerized rig control will make logging mistakes. Essentially 100% of the top logs have this Band/Mode logging problem to one degree or another - it is that common! Accurate Band/Mode information affects the score because there are 16 Band/Mode combinations (160m, 80, 40, 20, 15, 10, 6 or 2 meters, on PH or CW) and contacting a station on each of them counts for separate credit. Often correcting the Band/Mode is the difference between a full-credit QSO and essentially a NIL (a dupe).
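Since each Band/Mode combination counts for separate credit, QSO credit is effectively keyed on (station, Band, Mode). A minimal sketch of that keying (Python is used here purely for illustration):

```python
# The 16 separately-creditable Band/Mode combinations (8 bands x 2 modes),
# and the key under which QSO's between two stations are compared.
BANDS = ["160m", "80m", "40m", "20m", "15m", "10m", "6m", "2m"]
MODES = ["CW", "PH"]

BAND_MODE_COMBOS = [(b, m) for b in BANDS for m in MODES]   # 16 combinations

def credit_key(other_call, band, mode):
    """QSO's with the same key compete; only the highest-scoring one counts."""
    return (other_call.upper(), band, mode)
```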
Green (the log cross checking software) requires 2 Databases to do its job: (1) Calls DB and (2) QRZ DB.
The initial Calls DB is program generated and contains all US/VE calls heard more than 2x and all DX callsigns no matter the frequency of occurrence. The DB format is a histogram (callsign, #times heard). The next step is a human review by our most experienced CQP log-check team members: bad callsigns are removed from this DB manually. For example, AB6E is probably a bust of AD6E even though AB6E is a "valid callsign" in the US and it might have been "worked" 6 times. This process can take several days or perhaps a week. It is important because if a call appears in this DB, is a valid callsign (i.e. licensed), and we don't have that callsign's log, the submitter's log gets a Bye for that QSO instead of some kind of point deduction. Many "busted calls" are "heard" more than once.
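A rough sketch of how such a histogram could be built from the worked callsigns in the normalized logs. The US/VE prefix test below is deliberately crude and only illustrative; Green's real logic is more complete:

```python
from collections import Counter
import re

# Crude, illustrative US/VE prefix test (AA-AL, K, N, W, VA-VG, VO, VX, VY, CF-CK, CY, CZ).
US_VE_PREFIX = re.compile(r"^(A[A-L]|[KNW]|V[A-GOXY]|C[F-KYZ])", re.IGNORECASE)

def build_calls_db(worked_calls):
    """worked_calls: every callsign that appears as 'worked' in the normalized QSO lines."""
    heard = Counter(call.upper() for call in worked_calls)
    calls_db = {}
    for call, count in heard.items():
        is_us_ve = bool(US_VE_PREFIX.match(call))
        # Keep US/VE calls heard more than 2x; keep all DX calls regardless of count.
        if count > 2 or not is_us_ve:
            calls_db[call] = count          # histogram entry: (callsign, #times heard)
    return calls_db
```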
Running all “base” callsigns through a QRZ.com query program generates the QRZ file. This includes the US/VE calls that didn’t “make it” into the Calls DB due to low histogram count, but does not include the US 1x1 callsigns. QRZ doesn't know about 1x1 calls, so that is a separate process. The QRZ DB when used by Green will return “unknown, bad or good” for each “heard on the air” callsign. When the humans use the GUI tool, this status is available and the log cross check algorithm also uses it!
There is a special utility to generate the 1x1 list of valid calls. This is done by querying the VEC 1x1 DB website. Unfortunately the site cannot provide the desired list directly. The utility extracts license info for all possible 1x1 calls and then that info is parsed to generate the valid calls during CQP. QRZ does not have info about these special short term calls and they can be re-assigned to different persons during the year. The Calls DB and QRZ DB's are adjusted accordingly.
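A minimal sketch of enumerating the candidate calls for that query, assuming the standard 1x1 format of a K, N or W prefix, one digit and one suffix letter (3 x 10 x 26 = 780 possibilities):

```python
from string import ascii_uppercase

# Enumerate every possible US 1x1 special event callsign so the VEC 1x1 site
# can be queried for each one.
def all_possible_1x1_calls():
    return [p + d + s
            for p in "KNW"
            for d in "0123456789"
            for s in ascii_uppercase]
```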
Usually some pre- and post-processing is required to handle special prefixes; the specifics vary by year. For example VE3ABC could be signing VA3ABC on the air, and that is fine and legal, but QRZ may not know about VA3ABC, only about VE3ABC. Other countries have done similar things. This sort of situation requires some special year-by-year coding and filtering. This process usually takes a few days when it is required.
In recent years, manual examination of this QRZ DB is required. That is because QRZ no longer accurately reflects the FCC database for US callsigns. It is possible for a ham to “opt out” of a QRZ listing. You can’t tell the FCC to “de-list” your callsign, but now you can tell QRZ to do that!
There is now a manual process to look at all of the US callsigns that are judged to be "BAD" (i.e. illegal callsigns) in QRZ and double check them against the FCC DB. Some judgment is required. In CQP2013, 2 folks were on the air even though the FCC had revoked their licenses. All of these Q's were allowed although the submitters were contacting a "pirate" station. The QRZ DB does take into consideration callsigns that might have changed during the contest submission phase (maybe the licensee upgraded to a vanity call, etc). Revoked licenses get special scrutiny.
Green assumes that its knowledge of US callsigns is “perfect” and it does make a distinction between US and DX calls. The combination of the two DB’s will generate either (“good” or “bad”) for a US callsign. A DX callsign is more problematic. There is no “bad”, there is only “unknown” or “good”. There is no way to “prove” beyond a doubt that a DX callsign is “bad” although the human review of the QRZ DB file can come close to that.
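A rough sketch of that combination; the exact rule for combining the two DB's shown here is an assumption, not Green's actual logic:

```python
# Illustrative sketch of combining the two databases. For a US callsign the
# result is "good" or "bad"; for a DX callsign it can only be "good" or
# "unknown", since a DX call can never be proven "bad" beyond doubt.

def classify_call(call, is_us, calls_db, qrz_db):
    qrz_status = qrz_db.get(call, "unknown")      # "good", "bad" or "unknown"
    if is_us:
        if qrz_status == "good" or call in calls_db:
            return "good"
        return "bad"
    # DX callsign: never "bad"
    return "good" if qrz_status == "good" or call in calls_db else "unknown"
```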
STEPS for log-check:
Any mobile or multiple-sent-QTH logs are combined into a single Cabrillo file. This is a simple copy and paste operation. Only a handful of stations submit multiple-sent-QTH logs, and as long as these logs are named uniquely (e.g. K6AQL_MEND, K6AQL_SIER, etc.) this is not an issue.
Use the submitted and normalized files to create both log processing DB's. There are some manual steps! Run Green (the log check software). Run analysis to determine: (a) how many total decisions are required by the humans, and in which logs? (b) which logs generate the most problems in other logs? "Fix" those logs manually, starting with the "heavy offenders". Common problems are: the logged SentQTH is not what was sent on the air, or the log was submitted as CW only when all of the Q's are actually PH.
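A rough sketch of the kind of tallies that analysis step produces (the input data structure here is an assumption, not Green's actual output format):

```python
from collections import Counter

# Illustrative tallies: how many unresolved (human) decisions each log still
# needs, and which logs are causing problems in *other* logs.
# `pending_decisions` is assumed to be a list of (log_call, offending_call)
# pairs produced by the cross-check run.
def analyze(pending_decisions):
    decisions_per_log = Counter(log_call for log_call, _ in pending_decisions)
    problems_caused = Counter(offender for _, offender in pending_decisions)
    heavy_offenders = [call for call, _ in problems_caused.most_common(10)]
    return decisions_per_log, heavy_offenders
```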
At this stage, we should only be seeing errors that become apparent when multiple logs are compared against each other. Syntax errors should already have been dealt with.
The goal of the Normalization process is to:
(a) produce syntactically correct Cabrillo lines, and
(b) apply our DB of "translation tables" of allowed abbreviations for each QTH (there are literally thousands of aliases in our tables). The result of this "translation process" is the official CQP abbreviation for each QTH; a minimal sketch of the lookup follows.
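The alias entries below are made up for illustration; the real tables contain thousands of aliases mapping on-the-air abbreviations to the official CQP abbreviations:

```python
# Minimal sketch of the QTH "translation table" lookup.
QTH_ALIASES = {
    "SANTA CLARA": "SCLA",   # illustrative alias entries only
    "SCL": "SCLA",
    "ALAMEDA": "ALAM",
}

def normalize_qth(raw_qth):
    qth = raw_qth.strip().upper()
    return QTH_ALIASES.get(qth, qth)   # fall through if already official / unknown
```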
The Band/Mode problem rears its ugly head here. It is the most common serious problem and often requires a lot of human editing. We are looking for systemic errors and there are usually many! Cycle through the Green/Analysis/Fix process until the number of human decisions is "doable". The number of "turns" is typically about 6-8, and each turn takes a day or more for running, analysis and fixing. With better Normalization we can get this down to perhaps 3-5 "turns". The Green/Analysis/Fix refining process takes 2-3 weeks. Experience has shown that better quality logs make the human checking phase run much smoother. There is sometimes a lot of pressure to start human checking prematurely.
Human log check:
A set of .zip files is generated containing the program and the year-specific DB file. These are distributed via a shared Google folder. In addition, a Google spreadsheet is generated that assigns all of the logs to the specific checkers who are available that particular year, typically about 10. The goal of human log check is to reach a final determination for every QSO that the Green program couldn't resolve on its own. The most problematic and time consuming are the Band and Mode logging errors. This process usually takes about 3 weeks. The results from each checker are returned via the shared Google Drive as a single .zip file of that checker's Cabrillo files. ...probably more detail required in this section...?
Post-Human Checking:
The first step is to verify, via a file comparison utility, that all of the Cabrillo files that were distributed actually came back. Some years, a special fix program has to be written to adjust the returned files for some systemic problem encountered by the human checkers. An example could be that we find some log is missing 25% of its Q's and the decision is made to retroactively remove all of the NIL's against that station. However, with enough focus on the Green/Analysis/Fix refining process prior to human checking, the probability of something like this happening is greatly reduced; e.g. this step wasn't needed in 2013.
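A minimal sketch of that verification, assuming the distributed and returned Cabrillo files sit in two directories (the directory layout and ".log" extension are assumptions):

```python
from pathlib import Path

# Verify that every Cabrillo file sent out to the human checkers came back.
def verify_returned(distributed_dir, returned_dir):
    sent = {p.name for p in Path(distributed_dir).glob("*.log")}
    returned = {p.name for p in Path(returned_dir).glob("*.log")}
    missing = sorted(sent - returned)        # distributed but never returned
    unexpected = sorted(returned - sent)     # returned but never distributed
    return missing, unexpected
```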
Green is used to generate all of the .RPT files (what the ARRL calls an LCR, or Log Checking Report). The combination of all of these reports contains all of the scoring information needed for the final .PDF documents, the details of each and every point deduction, and a histogram of all mults worked.
DUPES only become relevant during RPT generation. When there are multiple Q's between 2 stations on the same Band/Mode, the highest scoring Q is counted regardless of which Q came first or last. This is done to each station's highest advantage. It is possible for Station A to get full credit for the first Q with Station B while B gets full credit for the 3rd Q with Station A (presuming that Station B made some kind of mistake during the 1st and 2nd Q's with Station A).
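A minimal sketch of that selection (the QSO record fields and the pre-assigned scores are assumptions about the data, not Green's actual internals):

```python
# For each (other station, Band, Mode) keep only the highest-scoring QSO;
# any remaining Q's with the same key are the dupes. `qsos` is assumed to be
# a list of dicts with "call", "band", "mode" and a "score" already assigned
# by cross-checking.
def best_per_band_mode(qsos):
    best = {}
    for qso in qsos:
        key = (qso["call"].upper(), qso["band"], qso["mode"])
        if key not in best or qso["score"] > best[key]["score"]:
            best[key] = qso
    return list(best.values())
```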
A utility is run which extracts only the relevant information from the RPT files for the final scoring reports: Call, final validated CW Q's, final validated PH Q's and final score. This is a .CSV file and can be examined and sorted in a spreadsheet. The .CSV file is then re-imported into the DB to be combined with the Category information. Log-check doesn't care about categories of entry; it only cares about validating and scoring logs.
In addition, the Time58 reports are generated from the Cabrillo files. There are 2 Time58 reports, both in CSV format. The first contains everyone who actually made it (either from out of state or in CA); it shows not only the time, but also who the 58th Q was with. The second Time58 report shows essentially the same information plus the last 3 Q's that led up to getting 58. Often something interesting for the final contest write-up can be gleaned from these reports.
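A heavily hedged sketch of a Time58 computation, assuming "making it" means working the 58th distinct multiplier and that the QSO records carry "time", "call" and "mult" fields (all assumptions about the data, not the actual report generator):

```python
# Walk the time-ordered QSO records and report when the 58th distinct
# multiplier was worked, and with whom.
def time_58(qsos):
    seen = set()
    for qso in qsos:
        seen.add(qso["mult"])
        if len(seen) == 58:
            return qso["time"], qso["call"]   # when and with whom #58 was worked
    return None                               # never made it to 58
```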
Sometimes there are logs known not to be good enough to use in the overall log check against other logs, but some kind of score against other logs is still desired (instead of just reporting their claimed score or listing them as a checklog). A special log-check report is run for those logs. There were 2 of these logs in CQP2013. Although the programs run in seconds or at most a few minutes, it takes at least an hour of work to inject something like this into the process.
Post Log-Check RPT(LCR) generation:
It is possible for two stations to have a "razor thin" margin between them. This has happened before. The main focus is on the overall winners: #1 vs #2 in CA and anybody who is on the margins of winning/not winning wine. The difference between say #6 and #7 in CA wouldn't receive the same attention.
When something like this happens (and it has, between #1 and #2 in CA), it takes a few weeks to figure it out. The margin can come down to 1 QSO, or to something like "if 2 PH Q's had been CW, then Station Y wins". CQP has a process that can accurately judge something like that with huge credibility, but it takes time to do it. In this case, 4 Senior Judges looked at each and every QSO in minute detail, with some custom software. Without the ability to do an intensive multiple-human review of a decision like that, it is a "coin flip" - random. In this case the result had huge credibility.