Using OpenCV and TensorFlow to identify a Boggle board in an image and decode what letters are on the board.
program to process multiple videos
INPUT: a directory of videos
OUTPUT: videos json file
videos json file:
{
    "123_456.mp4": {
        "board": "QWERTYUIOPASDFGHJKLZXCVBN",
        "frames": [frame, frame, frame, frame, ...]
    },
    "789_123.mp4": {
        "board": "QWERTYUIOPASDFGHJKLZXCVBN",
        "frames": [frame, frame, frame, frame, ...]
    }
}
frame:
[letterImg, letterImg, letterImg, letterImg, ... x25]
letterImg:
~88x88 (or 86x86, etc., variable size) grayscale image
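A minimal sketch of a sanity check for the videos json file described above. The notes do not say how images are encoded in the json, so this assumes (hypothetically) that each letterImg is a nested list of pixel values. The names check_videos and BOARD_SIZE are made up for illustration.

```python
import json

BOARD_SIZE = 25  # letters per board, matching the 25-character "board" strings


def check_videos(videos):
    """Return a list of human-readable problems found in a videos dict."""
    problems = []
    for name, entry in videos.items():
        board = entry.get("board", "")
        if len(board) != BOARD_SIZE or not board.isalpha():
            problems.append(f"{name}: board must be {BOARD_SIZE} letters")
        for i, frame in enumerate(entry.get("frames", [])):
            if len(frame) != BOARD_SIZE:
                problems.append(f"{name}: frame {i} has {len(frame)} letter images")
    return problems


# Tiny stand-in image: nested lists of pixel values (an assumption, see above).
tiny_img = [[0, 255], [255, 0]]

# Round-trip through json to mimic loading the file from disk.
videos = json.loads(json.dumps({
    "123_456.mp4": {
        "board": "QWERTYUIOPASDFGHJKLZXCVBN",
        "frames": [[tiny_img] * BOARD_SIZE, [tiny_img] * 24],
    }
}))

print(check_videos(videos))
```

The second frame here is deliberately one letter image short, so it is the only problem reported.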
frame labelling program:
INPUT: videos json file
OUTPUT: labelled frames json file
labelled frames json file:
[
    {
        "state": "good"/"bad"/"ugly"/"discard",
        "notes": "MPBRO was here",
        "board": "QWERTYUIOPASDFGHJKLZXCVBN",
        "video": "123_456.mp4",
        "frameNum": 123,
        "boardImg": frame #just 1 frame
    }
]
meaning of "state":
good: all the letters are clearly legible and not cut off
bad: all the letters are blurry and/or faint from reflected light
ugly: none of the "letters" are even recognizable
discard: individual letters are a mix of good, bad, and ugly
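A rough sketch of how the four states could be used to split labelled frames, for example to pick out the "discard" frames for per-letter labelling. The function name group_by_state is hypothetical, and the sample records leave out "boardImg", "board" and "notes" for brevity.

```python
from collections import defaultdict

VALID_STATES = {"good", "bad", "ugly", "discard"}


def group_by_state(labelled_frames):
    """Group labelled frame records by their "state" field."""
    groups = defaultdict(list)
    for rec in labelled_frames:
        state = rec["state"]
        if state not in VALID_STATES:
            raise ValueError(f"unknown state: {state!r}")
        groups[state].append(rec)
    return dict(groups)


# Sample records (image data omitted; values are made up).
labelled_frames = [
    {"state": "good", "video": "123_456.mp4", "frameNum": 10},
    {"state": "discard", "video": "123_456.mp4", "frameNum": 11},
    {"state": "discard", "video": "789_123.mp4", "frameNum": 3},
]

groups = group_by_state(labelled_frames)
to_label = groups.get("discard", [])  # frames whose letters need individual labels
print([(r["video"], r["frameNum"]) for r in to_label])
```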
letter labelling program:
INPUT: labelled frames json file (filtered: only frames with state "discard" are used)
OUTPUT: labelled letters json file
labelled letters json file:
[
    {
        "notes": "MPBRO was here",
        "letter": "A",
        "video": "123_456.mp4",
        "frameNum": 123,
        "col": 3,
        "row": 2,
        "letterImg": letterImg #just 1 letter image
    }
]
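One possible way to turn a whole labelled frame into records of the shape above, sketched under several assumptions the notes do not confirm: the board is a 5x5 grid (25 letters), the "board" string and the frame's letter images are both in row-major order, and "row"/"col" are zero-based. The helper names letter_at and frame_to_letter_records are hypothetical.

```python
GRID = 5  # assumed 5x5 layout, since each board string has 25 letters


def letter_at(board, row, col):
    """Letter at zero-based (row, col), assuming row-major board order."""
    return board[row * GRID + col]


def frame_to_letter_records(frame_rec):
    """Expand one labelled frame into one record per letter image."""
    records = []
    for idx, letter_img in enumerate(frame_rec["boardImg"]):
        row, col = divmod(idx, GRID)
        records.append({
            "notes": frame_rec["notes"],
            "letter": letter_at(frame_rec["board"], row, col),
            "video": frame_rec["video"],
            "frameNum": frame_rec["frameNum"],
            "col": col,
            "row": row,
            "letterImg": letter_img,
        })
    return records


# Sample frame; strings stand in for the 25 letter images.
frame_rec = {
    "state": "good",
    "notes": "",
    "board": "QWERTYUIOPASDFGHJKLZXCVBN",
    "video": "123_456.mp4",
    "frameNum": 123,
    "boardImg": [f"img{i}" for i in range(25)],
}

records = frame_to_letter_records(frame_rec)
print(records[13]["letter"], records[13]["row"], records[13]["col"])
```

This kind of expansion only makes sense for frames where the board string can be trusted for every letter; frames marked "discard" would still go through the letter labelling program.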
machine learning dataset generator
INPUT: labelled frames json file(s), labelled letters json file(s), options for what to include, and whether to truncate letters that have far too many samples
OUTPUT: dataset with labels (and possibly a count of how many of each letter?)
dataset with labels:
data = [letterImgScale, letterImgScale, letterImgScale, ...]
labels = [0, 26, 14, 5, 2, 8, ...] #0 = unknown, 1-26 = A-Z
letterImgScale:
30x30 grayscale image
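A minimal sketch of the dataset generator's core steps: label encoding (0 = unknown, 1-26 = A-Z), scaling to 30x30, optional truncation of over-represented letters, and per-letter counts. All function names here are hypothetical. Images are assumed to be nested lists of pixel values, and the pure-Python nearest-neighbour resize is only a stand-in; a real pipeline would more likely use OpenCV's cv2.resize.

```python
from collections import Counter


def letter_to_label(letter):
    """Map 'A'-'Z' to 1-26 and anything else to 0 (unknown)."""
    if len(letter) == 1 and "A" <= letter <= "Z":
        return ord(letter) - ord("A") + 1
    return 0


def resize_nearest(img, size=30):
    """Nearest-neighbour resize of a 2D list of pixels to size x size."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def build_dataset(letter_records, max_per_label=None):
    """Return (data, labels, counts), optionally capping samples per label."""
    data, labels, counts = [], [], Counter()
    for rec in letter_records:
        label = letter_to_label(rec["letter"])
        if max_per_label is not None and counts[label] >= max_per_label:
            continue  # truncate over-represented letters
        data.append(resize_nearest(rec["letterImg"]))
        labels.append(label)
        counts[label] += 1
    return data, labels, counts


# Sample 88x88 image with made-up pixel values.
img = [[(r + c) % 256 for c in range(88)] for r in range(88)]
letter_records = [{"letter": ch, "letterImg": img} for ch in ["E", "E", "E", "Z", "?"]]

data, labels, counts = build_dataset(letter_records, max_per_label=2)
print(labels)
print(dict(counts))
```

With max_per_label=2 the third "E" is dropped, and the "?" record falls into the unknown class 0.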