Topic: Integrating OCR & OMR processes in Python code.
Honestly, I'm quite a noob at coding, but I'm full of general curiosity and ideas I'd like to try to build. Please forgive my ignorance if I sometimes ask the wrong questions; that could be the reason I don't have much luck searching for answers to my questions on the web. I have more than one question, so ideally I'd like to turn this into a discussion, and I'll add questions that are critical for me as the discussion unfolds. Please don't hesitate to contribute based on your knowledge, or to share your personal experience working on similar projects. No matter how much experience you have with this topic, I believe there is a little something in each reply that I could learn from, just like a missing puzzle piece that makes the whole picture clear. I'd like to clarify that the goal is not to make an OCR or OMR app/tool in Python, but to integrate these two processes as part of a workflow: receiving data input (assigning it to, or generating, elements/classes) that would help the Python app make further decisions.
8/13/2019 6:23:05 PM - Korinth3D
9 Answers
I'm making a Minesweeper bot. The sensor part (OMR) I did entirely the manual way, by checking pixels. It works, so I didn't think much more about it. But for your case, you might need to determine what you want to work on first. Handwritten text? Printed text? Some type of special marking? Each has its own approach and a toolset that better suits the purpose. Which sensors and actuators will you be working with? Android cameras? Apple cameras? PC cameras? Android screen-touch simulation? Mouse-click simulation? Physical rotors? I don't think I can help much until you have a very clear, or at least more defined, vision for your project; then you can further abstract the workflow and the components. Nevertheless, I can see your learning passion is real, and I hope you make progress with whatever you are trying out. For OCR/OMR ("computer vision" might be the high-level term you are looking for), I suggest checking out OpenCV. There are a lot of guides out there for OpenCV, and implementations are available for Python as well.
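To illustrate the "manual pixel checking" approach mentioned above, here is a minimal sketch. The colors, coordinates, and the tiny fake screenshot are all made up for demonstration; in real use the image would come from a screenshot library (e.g. Pillow's ImageGrab) and the color table would be built by sampling your actual game.

```python
# Toy "sensor" in the spirit of manual pixel checking.
# All color values and the fake screenshot below are hypothetical.

# Map sampled tile colors (RGB) to tile states (made-up values).
TILE_COLORS = {
    (192, 192, 192): "unrevealed",
    (255, 255, 255): "empty",
    (0, 0, 255): "1",
    (0, 128, 0): "2",
    (255, 0, 0): "mine",
}

def classify_tile(image, x, y):
    """Look up the pixel at (x, y) and map its color to a tile state."""
    color = image[y][x]
    return TILE_COLORS.get(color, "unknown")

# A tiny fake 2x2 "screenshot": a grid of RGB tuples.
fake_screen = [
    [(192, 192, 192), (0, 0, 255)],
    [(0, 128, 0), (255, 0, 0)],
]

print(classify_tile(fake_screen, 1, 0))  # "1"
print(classify_tile(fake_screen, 0, 1))  # "2"
```

The obvious limitation, as discussed below, is that hard-coded colors and coordinates break as soon as the resolution or theme changes, which is where template matching or OpenCV comes in.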
You want to do OCR and OMR, right? Then you have to take your photos from somewhere. Why would you take a screenshot if you can take a photo? Or am I misunderstanding, and you want to do OCR and OMR on elements of another app? Because if you want to extract data from an app, there's a more systematic way to do it, namely reverse engineering. When OCR and OMR are mentioned, they always go together with taking pictures and then analyzing them, and taking pictures makes use of sensors. If you go for the Python-on-Android approach, I have no knowledge there and cannot give you any help. Try creating a simple app to get a sense of the workflow.
The purpose of this discussion is to fill gaps in my theoretical understanding of the main principles of how Python operates. As many programmers have advised me, a programming language is learned through practice. I'm generally passionate about neuroscience, and I'd like to explore machine learning eventually, once I get my head around all the theoretical subjects related to AI. So, as my practice, I'm starting with simple task-automation scripts and moving towards compiling them into a basic bot. OCR & OMR are the next topic I'd like to start exploring, as they would let me try a different workflow for how the bot would interact with my device or complete given tasks.
Q1. What existing applications/programs utilize or rely on one or both of these processes (OCR/OMR) that could be taken as examples to study? Q1.1. Have you had personal experience coding something similar? If so, did you manage to achieve the desired/intended result? Did you run into any difficulty that stalled your progress or made the work slow or tedious?
Hi Gloria Borger, thank you for your response! I'll be working with Android screen-touch simulation. It will be a bot application to which I'd grant access to use certain apps on my phone, complete some tasks automatically, and provide a report/feedback. I understand that for OMR I'd be manually creating a library of pixel images. In that case, would resolution matter if I used the bot on a device whose screen resolution differs from the device I originally created the library on? The thing is, I wouldn't want to pre-record coordinates for touch actions. I'd like the app to first scan the screen for image and text elements, then decide among the UI options, and then act based on the acquired information about the UI. Would it first have to make a screen capture, i.e. make a digital image and store it, before analysing? Or can such a toolset do it in real time, without making a digital image?
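The scan-then-decide workflow described here can be sketched roughly as a sense → decide → act loop. Note that `capture_screen`, `find_elements`, and `tap` below are hypothetical placeholders, not real library calls; they stand in for whatever screenshot, vision, and input-simulation tools end up being used.

```python
# Rough sketch of one step of a screen-reading bot.
# capture_screen, find_elements, and tap are hypothetical stand-ins.

def run_bot_step(capture_screen, find_elements, tap):
    screenshot = capture_screen()           # 1. grab the current screen
    elements = find_elements(screenshot)    # 2. OCR/OMR: locate text/image elements
    for name, (x, y) in elements.items():   # 3. decide based on what was found
        if name == "ok_button":             # hypothetical UI element name
            tap(x, y)                       # 4. act on the device
            return name
    return None

# Fake implementations so the loop can run without a device:
result = run_bot_step(
    capture_screen=lambda: "fake screenshot",
    find_elements=lambda s: {"ok_button": (100, 200)},
    tap=lambda x, y: None,
)
print(result)  # "ok_button"
```

The point of structuring it this way is that the decision logic stays the same whether the screenshot is analyzed from a stored file or in real time.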
Gloria Borger Yes, OpenCV is the only option I know of so far; I've looked at its documentation and seen some reviews and tutorials showing it in action. I guess I'll have to try it myself first and see if it works for me. I currently have PyCharm and Atom. I thought of doing it in PyCharm and testing on the NoxPlayer emulator for a start, before adapting it to my device and testing on it. Would you suggest using a different IDE? I've acquired 35+ different books on Python, from beginner to advanced, in the past 2 weeks. Obviously many cover the same topics, and some have a little extra, but I wouldn't want to miss anything. I also found 2 video courses on how to convert Python 2 code to Python 3, as I've heard there is much more online documentation available for Python 2 than for Python 3. I'm now going to choose an app I'll be testing against/developing for, and draw an outline to define the general workflow of the script.
If you want your app to run entirely on Android (take in data from Android sensors, analyze it on Android, then display on Android), then you will work with Java and develop in Android Studio. (I'm actually not sure about this; I've been using Android Studio and Java to develop all my Android apps, but maybe there is a way to develop an Android app in Python?) If you only want to read data from an Android sensor and then process it on your computer, I think there should be a way to stream sensor data from Android to Python. Resolution and the camera sensor will matter a lot. Imagine one device with a very good camera that returns realistic images with few flaws, compared to a low-quality one that returns fisheyed images; your program would have to handle both cases. I suggest looking into convolutional neural networks, which should somewhat mitigate the issue of resolution. CNNs are currently the state-of-the-art technique for anything image-related in machine learning. But if OpenCV can handle that for you already, then that's fine. You don't have to store the image in local storage; you can do it in real time. But the degree of real time depends on how good your algorithm is. If it takes a lot of time to analyze an image, then you would want to go with the local-storage approach, but if it is fast and takes just some microseconds, then you can do it in real time.
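On the resolution question specifically, one common trick (independent of OpenCV or CNNs) is to store UI positions as fractions of the screen size rather than as absolute pixels, so the same logic carries over between devices. A minimal sketch, with a made-up button position:

```python
# Resolution-independent coordinates: store positions as fractions (0.0-1.0)
# of the screen size, then convert to pixels for the device at hand.

def to_pixels(frac_x, frac_y, width, height):
    """Map fractional screen coordinates to pixel coordinates."""
    return (round(frac_x * width), round(frac_y * height))

# Hypothetical element: a button centered horizontally, near the bottom.
BUTTON = (0.5, 0.9)

print(to_pixels(*BUTTON, 1080, 1920))  # (540, 1728) on a 1080x1920 phone
print(to_pixels(*BUTTON, 720, 1280))   # (360, 1152) on a 720x1280 phone
```

This handles tap coordinates; matching the pixel-image library itself across resolutions is a separate problem (templates would need rescaling, which is where OpenCV-style template matching helps).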
I am not planning on using device sensors such as the camera, mic, or gyro. Or have I misunderstood the word "sensors"? The digital image would be captured as a screenshot.
I've installed the Kivy package and Pygame in PyCharm. That should do to make it into an Android app, no?