My progress in implementing AGI
September 2, 2019
It's not perfect by a long shot yet, but here it is: raster-to-vector conversion! I can now convert bitmaps containing text to a vector representation of arbitrary accuracy. Update Sept. 12: I noticed today that the latest scientific data seem to agree with my approach: https://www.scientificamerican.com/article/no-bones-about-it-people-recognize-objects-by-visualizing-their-skeletons/
September 1, 2019
Major progress: The vectorization code works. The image below shows which segments will become vectors. Note how arbitrary accuracy can be chosen: the inner curve in the left @ symbol consists of five vectors, while the one on the right has eight.
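To make the "arbitrary accuracy" idea concrete, here is a minimal sketch. It is not my actual vectorization code; it swaps in the well-known Ramer-Douglas-Peucker polyline simplification as a stand-in, because it has the same property: a single tolerance decides how many vectors a curve ends up with. The names `simplify`, `point_segment_distance`, and `tolerance` are hypothetical and exist only for this illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b (all (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(chain, tolerance):
    """Ramer-Douglas-Peucker: reduce a chain of points to fewer vertices.

    Every dropped point lies within `tolerance` of the simplified chain,
    so a smaller tolerance yields more vectors. (Assumed stand-in
    technique, not the method used in the blog's own code.)
    """
    if len(chain) < 3:
        return list(chain)
    first, last = chain[0], chain[-1]
    worst_index, worst_distance = 0, -1.0
    for i in range(1, len(chain) - 1):
        d = point_segment_distance(chain[i], first, last)
        if d > worst_distance:
            worst_index, worst_distance = i, d
    if worst_distance <= tolerance:
        # One vector from first to last is accurate enough.
        return [first, last]
    # Otherwise split at the worst point and simplify both halves.
    left = simplify(chain[:worst_index + 1], tolerance)
    right = simplify(chain[worst_index:], tolerance)
    return left[:-1] + right


# A quarter circle of radius 20, sampled once per degree.
arc = [(20 * math.cos(math.radians(a)), 20 * math.sin(math.radians(a)))
       for a in range(91)]
coarse = simplify(arc, 2.0)   # loose tolerance: few vectors
fine = simplify(arc, 0.2)     # tight tolerance: more vectors
print(len(coarse) - 1, "vectors vs.", len(fine) - 1, "vectors")
```

The same curve comes out with fewer or more vectors depending only on the tolerance, which is the effect visible in the two @ symbols.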
August 29, 2019
I've completed the final phase of preparation for vectorization. Each segment of a single color will initially be vectorized on its own, as one vector chain of adjacent pixels.
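As a rough illustration of what "segments of the same color" could mean in code, here is a minimal sketch that groups adjacent pixels of equal color using a plain flood fill. The 4-neighbor adjacency, the nested-list bitmap format, and the name `color_segments` are all assumptions made for this example; my real code may differ.

```python
from collections import deque


def color_segments(bitmap):
    """Group 4-connected pixels of equal color into segments.

    `bitmap` is a list of rows, each row a list of color values.
    Returns a list of (color, [(x, y), ...]) tuples, one per segment.
    (Hypothetical helper for illustration only.)
    """
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    segments = []
    for y in range(height):
        for x in range(width):
            if seen[y][x]:
                continue
            # Start a new segment and flood-fill outward from (x, y).
            color = bitmap[y][x]
            seen[y][x] = True
            queue = deque([(x, y)])
            pixels = []
            while queue:
                cx, cy = queue.popleft()
                pixels.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                               (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx]
                            and bitmap[ny][nx] == color):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            segments.append((color, pixels))
    return segments


# Tiny 3x3 example bitmap with two colors (0 and 1).
example = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 0, 0],
]
for color, pixels in color_segments(example):
    print(color, pixels)
```

Each resulting pixel list is then a candidate for being turned into its own chain of vectors.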
August 28, 2019
You can't have the best AGI without the best data, and you can't have the best data without OCR, because a lot of books are only available in scanned form. I've evaluated commercial OCR solutions of the highest publicly available quality, and even those that cost $28,000 are unacceptable to me. I guess that's why Google acquired or wrote their own (I think they bought an OCR company, since they're pretty incompetent @Google); I'm referring to the Google Cloud Vision API. My OCR system is business-critical, so I'll write my own. Then, when surprises come up as I put millions of books through the system, I can actually fix the code. Is it a bad idea to "waste time" building an OCR system from scratch? Not at all: I'm downloading hundreds of terabytes of data in the meantime, and of course it takes a very long time to design the first stage of getting to AGI: Natural Language Processing (not to be confused with Natural Language Understanding!).
The dumb way to do OCR is to train a neural network on a bazillion different fonts or do bitmap comparisons. I'm doing OCR differently. If my AGI project fails, I'll make millions just selling the patents and working code for my upcoming OCR system. I've started with a skimmer: