🌱 Continuously Growing Engineer
With a passion for learning and growth, I have been running a technical blog since 2018 on topics like Kubernetes, PyTorch, and Airflow, which has reached 1,500+ monthly active users (MAU). I have also participated in 10+ study groups, where I enjoy exploring the core principles of technology and applying them to real-world challenges.
🚀 Efficiency-Maximizing Engineer
I automate repetitive tasks and look for more efficient ways to operate. I developed a Python backend for model serving and optimized its pipelines. Using GitLab CI and Kubernetes, I built pipelines and deployed with the sidecar pattern, which resolved versioning issues and significantly reduced build times.
🫂 Collaboration-Focused Engineer
I integrated project source codes using Git Submodule to create a seamless collaborative environment. I also introduced MLOps tools like DVC and MLflow into the team's workflow, greatly improving collaboration on data and models and fostering a productive team atmosphere.
Last updated: 2024. 12. 28
Zerohertz
Developed backend infrastructure for WAPL, a collaboration platform, using microservices architecture (MSA) within an 11-member team.
Developed scheduling functionality utilizing a Netty-based in-house Java backend framework, supporting the efficient creation, deletion, and retrieval of schedules.
Managed the entire lifecycle of machine learning services (annotation, modeling, training, deployment) and oversaw Kubernetes-based IDC infrastructure within an 11-member team.
Researched and developed models for text, signature, and checkbox detection, as well as information extraction, for the AI optical character recognition (OCR) product TwinReader.
Developed a Python backend for model serving and optimized pipelines to enhance efficiency.
Streamlined the development process by creating a solution that resolved versioning challenges and reduced build times through the separation of backend dependencies.
Executed AI projects and proof of concept (PoC) to meet client specifications.
Researched and developed diagnostic models utilizing time-series and vision data for real-time monitoring of process conditions in industrial-scale manufacturing systems.
Published two SCI(E) Q1-ranked research papers on feature selection algorithms, advancing efficient machine learning methodologies in industrial applications.
Executed 10+ government and industry-funded projects, further enhancing research capabilities in machine learning.
Developed a bearing condition diagnostic model and a graphical user interface.
Developed virtual reality environments based on C++ and Unreal Engine.
To reduce time consumption and inefficiency from reimplementing commonly used functions, developed and published a custom Python library on PyPI and GitHub Releases to enhance efficiency and code reusability across projects.
Built a GitHub Actions-based CI/CD pipeline (migrated from Jenkins) to automate repetitive tasks such as formatting, unit testing, and deployment, streamlining the process for feature additions and bug fixes.
To prevent unnecessary deployments from non-production changes like documentation updates, implemented a detailed branching strategy on GitHub and set up dedicated pipelines for code segregation.
Simplified version tracking by building a pipeline using the GitHub API to automatically generate and publish release notes to GitHub Releases, improving transparency across development cycles.
Ensured easy access to comprehensive project guidelines and function usage by creating a Sphinx-based documentation pipeline, deploying it via GitHub Pages for consistent and up-to-date project documentation.
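The release-note step of the library pipeline above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: it assumes the GitHub REST API's "create a release" endpoint with its `generate_release_notes` option, and the owner, repository, tag, and token values are placeholders. The request is only built, not sent.

```python
import json
import urllib.request


def build_release_request(owner: str, repo: str, tag: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a 'create a release' request for the GitHub REST API."""
    url = f"https://api.github.com/repos/{owner}/{repo}/releases"
    payload = {
        "tag_name": tag,
        "name": tag,
        # Ask GitHub to generate the release notes automatically.
        "generate_release_notes": True,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder values; a CI job would read these from its environment.
request = build_release_request("example-owner", "example-repo", "v1.2.3", "<token>")
print(request.get_method(), request.full_url)
```

In a CI job, passing the request to `urllib.request.urlopen` would publish the release; whether the original pipeline relied on GitHub's generated notes or composed its own text is not stated.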
Packaged frequently used classes and functions within the model backend into a Python library to streamline development processes.
Utilized Docstring to document functions and classes, enhancing code clarity and team collaboration, while maintaining library integrity through type hints and PyTest.
Faced with significant compatibility issues due to inconsistent libraries and formats for model outputs, standardized the data format for preprocessing and model inference visualization, enabling consistent visualization and resolving unexpected compatibility problems.
Addressed inefficiencies in post-processing due to Python-native functions with high time complexity by optimizing them with Cython-native functions and improving time complexity. (inference time decreased by 74.12%)
Developed a unified class and inheritance structure for Triton Inference Server.
Developed models for document area detection, rotated document classification, and detection of text, signatures, and checkboxes, along with a Python backend for model deployment.
Integrated project source codes using Git Submodule to facilitate a smooth collaborative environment.
Implemented a pipeline using GitLab CI and Kubernetes to separate backend dependencies from code and weights, deploying through the Kubernetes sidecar pattern, which resolved versioning challenges and significantly reduced build times for the model backend.
Faced with excessive GPU usage during model deployment, resolved the issue by identifying and fixing a memory leak through GPU resource monitoring and logging. (GPU memory usage reduced by 47.9%)
Reduced inference time for a text detection model, where frequent calls made optimization critical, by utilizing TensorRT-based quantization. (inference time decreased by 87.31%)
Addressed low accuracy in document rotation classification by performing batch inference on image tensors rotated in four directions and averaging the results. (accuracy improved by 2.01%p)
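The four-direction averaging used for rotation classification can be sketched in plain Python. The details are assumptions: class c is taken to mean "turned c quarter turns counterclockwise", each view's probabilities are shifted back to the original frame before averaging, and `toy_rotation_model` is a hypothetical stand-in for the real classifier. A real implementation would stack the four views into one batch instead of looping.

```python
import math
from typing import Callable, List

Image = List[List[float]]


def rot90_ccw(image: Image) -> Image:
    """Rotate a 2D image by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*image)][::-1]


def toy_rotation_model(image: Image) -> List[float]:
    """Hypothetical classifier: softmax over the four corner pixels.

    An upright toy document has a bright marker in its top-left corner, so the
    marker's corner reveals how many quarter turns were applied.
    """
    corners = [image[0][0], image[-1][0], image[-1][-1], image[0][-1]]
    exps = [math.exp(v) for v in corners]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_rotation_tta(image: Image, model: Callable[[Image], List[float]]) -> List[float]:
    """Average class probabilities over four rotated views of the same image."""
    averaged = [0.0] * 4
    view = image
    for k in range(4):
        probs = model(view)  # view is the input turned k extra quarter turns
        for c in range(4):
            # A document of class c appears as class (c + k) % 4 in this view.
            averaged[c] += probs[(c + k) % 4] / 4
        view = rot90_ccw(view)
    return averaged


upright = [[5.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
scanned = rot90_ccw(upright)  # document turned one quarter turn
probs = predict_with_rotation_tta(scanned, toy_rotation_model)
print(max(range(4), key=probs.__getitem__))  # 1
```

Shifting the indices before averaging matters: without it, the four views would vote for four different classes and cancel each other out.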
Performed clustering, annotation, preprocessing, training, and deployment to develop a model for extracting information from a wide variety of trade document formats.
Faced with the challenge of categorizing large volumes of unstructured PDF documents, developed an AI OCR-based pipeline utilizing OCR results and LLM prompting to efficiently classify and sort documents. (achieved 93.75% accuracy)
To address the high time and cost demands of large-scale data annotation requiring expert knowledge, accelerated the process by implementing pre-labeling through an ML backend using Label Studio SDK, significantly reducing annotation time and costs.
Because complex human errors were difficult to check manually during annotation review, developed a Streamlit-based GUI that allows easy detection and correction of these errors through simple configurations.
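Pre-labeling as described above comes down to turning model outputs into predictions that Label Studio can display. The sketch below covers only that conversion step, under assumptions: boxes arrive as pixel coordinates, the labeling config uses a `RectangleLabels` control, and the control names `label` and `image` and the `model_version` string are hypothetical. It does not call the Label Studio SDK.

```python
from typing import Dict, List, Tuple

# (x_min, y_min, x_max, y_max) in pixels, label, confidence
Box = Tuple[Tuple[float, float, float, float], str, float]


def to_label_studio_prediction(
    boxes: List[Box],
    image_width: int,
    image_height: int,
    from_name: str = "label",  # hypothetical control name in the labeling config
    to_name: str = "image",  # hypothetical object name in the labeling config
    model_version: str = "prelabel-v0",  # hypothetical version string
) -> Dict:
    """Convert pixel-space boxes into a Label Studio prediction dictionary."""
    results = []
    for (x_min, y_min, x_max, y_max), label, score in boxes:
        results.append(
            {
                "from_name": from_name,
                "to_name": to_name,
                "type": "rectanglelabels",
                "value": {
                    # Label Studio stores coordinates as percentages of the image size.
                    "x": 100.0 * x_min / image_width,
                    "y": 100.0 * y_min / image_height,
                    "width": 100.0 * (x_max - x_min) / image_width,
                    "height": 100.0 * (y_max - y_min) / image_height,
                    "rotation": 0,
                    "rectanglelabels": [label],
                },
                "score": score,
            }
        )
    mean_score = sum(s for _, _, s in boxes) / len(boxes) if boxes else 0.0
    return {"model_version": model_version, "score": mean_score, "result": results}


prediction = to_label_studio_prediction([((50, 20, 150, 70), "text", 0.9)], 200, 100)
print(prediction["result"][0]["value"])
```

Annotators then correct these pre-filled regions instead of drawing every box from scratch, which is where the time saving comes from.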
Developed a vehicle type classification model for PoC execution.
Researched and developed models for filter, part recognition, repair type, and damage type, along with the model inference pipeline.
Developed a demo page using Streamlit and deployed it on Kubernetes.
Developed models for segmentation of burn areas and severity diagnosis in burn patients.
Designed and developed an API for model deployment using Triton Inference Server.
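For context, a client request to a model served by Triton Inference Server can be sketched as below. It assumes Triton's standard HTTP inference endpoint (the KServe v2 protocol) and is not the API designed in this project; the model name, tensor name, and shape are hypothetical.

```python
import json
import math
from typing import List, Sequence, Tuple


def build_infer_request(
    model_name: str,
    input_name: str,
    shape: Sequence[int],
    data: Sequence[float],
    datatype: str = "FP32",
    output_names: Sequence[str] = (),
) -> Tuple[str, str]:
    """Return the URL path and JSON body for a v2 inference request."""
    if math.prod(shape) != len(data):
        raise ValueError("flattened data length does not match the tensor shape")
    body = {
        "inputs": [
            {
                "name": input_name,
                "shape": list(shape),
                "datatype": datatype,
                "data": list(data),  # flattened, row-major tensor contents
            }
        ]
    }
    if output_names:
        # Optionally restrict the response to the named output tensors.
        body["outputs"] = [{"name": name} for name in output_names]
    return f"/v2/models/{model_name}/infer", json.dumps(body)


# Hypothetical model and tensor names, for illustration only.
path, body = build_infer_request("burn_segmentation", "INPUT__0", [1, 2, 2], [0.0, 0.1, 0.2, 0.3])
print(path)
```

Posting `body` to that path on the server's HTTP port returns the output tensors in the same JSON layout.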
Installed Kubernetes using Kubeadm in an on-premises environment to enhance understanding of Kubernetes architecture and practical usage.
Secured deployed services by implementing HTTPS protocol and Google OAuth2 through Traefik.
Established GitOps by automating build and deployment processes using GitHub Actions and Argo CD.
Built a node status monitoring GUI leveraging Node Exporter, Prometheus, and Grafana.
Automated various tasks using Apache Airflow integrated with KubernetesPodOperator.
Set up a Docker image build and deployment pipeline using Jenkins and Kaniko.
Developed a YOLOv5-based logo segmentation model.
Constructed a model deployment server on Amazon EC2 Inf1.
Enhanced environment compatibility by removing python==3.7 from requirements.txt, enabling broader setup compatibility for SPTSv2.
Corrected variable type mismatch by aligning the depths variable to a list type for consistency with its default value, enhancing code clarity and reducing runtime errors.
Generalized configuration by implementing customizable parameters like max_length in data loading and model setup, improving SPTSv2 adaptability for varied use cases.
Optimized memory usage in inference by adding the @torch.no_grad decorator in predict.py, significantly reducing GPU memory requirements.
Resolved IndexError during training with customized data by fixing shape mismatches in GT data, ensuring stability in data handling.
Addressed tensor dimension errors and generalized prediction, evaluation, and visualization processes.
Identified and verified a dependency mismatch with Image.Resampling.BILINEAR (Pillow >=9.1.0).
Conducted version testing and recommended an update to Streamlit’s requirements, improving reliability for developers by preventing compatibility issues.
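The mismatch arises because the `Image.Resampling` enum only exists from Pillow 9.1.0 onward, while older releases expose `Image.BILINEAR` directly. The contribution above addressed this by recommending a requirements update. The sketch below shows a different, commonly used alternative, a small compatibility shim, purely as an illustration. `resolve_bilinear` is a hypothetical helper, and the two `SimpleNamespace` objects stand in for old and new versions of `PIL.Image` so the example runs without Pillow installed.

```python
from types import SimpleNamespace
from typing import Any


def resolve_bilinear(image_module: Any) -> Any:
    """Return the bilinear resampling constant for old and new Pillow layouts."""
    resampling = getattr(image_module, "Resampling", None)
    if resampling is not None:
        # Pillow >= 9.1.0: constants live on the Image.Resampling enum.
        return resampling.BILINEAR
    # Older Pillow: constants are plain attributes of the Image module.
    return image_module.BILINEAR


# Stand-ins for PIL.Image before and after 9.1.0 (not real Pillow objects).
old_image = SimpleNamespace(BILINEAR="old-bilinear")
new_image = SimpleNamespace(Resampling=SimpleNamespace(BILINEAR="new-bilinear"))

print(resolve_bilinear(old_image), resolve_bilinear(new_image))
```

With Pillow installed, the call would be `resolve_bilinear(Image)` after `from PIL import Image`. Raising the minimum version in the requirements, as recommended in the contribution, avoids the need for a shim like this.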
Added a motion.duration parameter in Hexo's NexT theme _config.yml, enabling flexible configuration of motion animation duration.
Modified source/js/motion.js to retrieve motion.duration dynamically, with a default value fallback for robustness.
Contributed minor wording refinements to improve grammatical accuracy.
Customized a technical blog based on the Hexo NexT theme to document and share solutions to challenges encountered during personal learning and professional work.
Achieved 1,500+ MAU and 2,600+ monthly page views by consistently writing 200+ posts since 2018.
Implemented an automated data ingestion and preprocessing pipeline using GitHub Actions to enhance data workflow efficiency.
Delivered insights to technical research personnel (전문연구요원) through data visualizations created with Matplotlib, supporting decision-making from multiple analytical perspectives.
Implemented an automated data ingestion and preprocessing pipeline using GitHub Actions to enhance data workflow efficiency.
Delivered insights to skilled industrial personnel (산업기능요원) through data visualizations created with Matplotlib, supporting decision-making from multiple analytical perspectives.
Inventor: Changwoo Lee, Minho Jo, Yoonjae Lee, Hyogeun Oh
Application number: 1020220017419 (2022.02.10)
Grant number: 1025842600000 (2023.09.25)
Author: Hyogeun Oh, Jaehyun Noh, Changbeom Joo, Gyoujin Cho, Jeongdai Jo, Changwoo Lee
Journal: Measurement [Impact Factor 5.60, JCR Top 17.19%]
Author: Hyogeun Oh, Yoonjae Lee, Jongsu Lee, Changbeom Joo, Changwoo Lee
Journal: Journal of Computational Design and Engineering [Impact Factor 6.16, JCR Top 10.87%]
Advisor: Changwoo Lee
GPA: 4.15 / 4.5
SiM Lab. (Smart intelligent Manufacturing system Laboratory)
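Study on improving web handling defects considering the internal stress distribution of wound rolls (copper film), SK nexilis (2022. 10 ~ 2023. 01)
Analyzed features of acceleration data under vibration and shock during product transportation
Training of specialized personnel in ultra-precision, large-area production systems for future sensors, National Research Foundation of Korea (2021. 09 ~ 2023. 01)
Optimal design of an adapter for small satellite separation, Spacebey (스페이스베이) (2022. 09 ~ 2023. 01)
Developed and optimized a model for analyzing motor dynamic characteristics
Development of an intelligent roll-to-roll coating system equipped with a multi-coater for secondary battery electrodes, Ministry of Trade, Industry and Energy (2022. 05 ~ 2023. 01)
Developed signal processing and models for diagnosing eccentric rolls in Roll-to-Roll continuous processes
Development of smart printed electronics manufacturing technology for mass production of large-area, high-efficiency functional films, National Research Foundation of Korea (2021. 01 ~ 2022. 12)
Developed a coated-layer diagnostic model and graphical user interface for the Roll-to-Roll slot-die coating process
Compensation technology through thermal wrinkle analysis and precision tension control technology, LG Electronics (2022. 03 ~ 2022. 12)
Developed a model for predicting lateral displacement caused by roll misalignment
Web handling-based roll layout optimization and analysis of lateral motion and wrinkling due to tilting, LG Energy Solution (2022. 05 ~ 2022. 12)
Developed a model for predicting lateral displacement caused by roll misalignment
Drying system analysis and air flotation unit nozzle optimization to minimize flutter of automotive battery electrode material, LG Energy Solution (2022. 04 ~ 2022. 11)
Developed a thermal conductivity profile for identifying the material temperature distribution
Research Center for R2R Printed Flexible Computer Development, National Research Foundation of Korea (2021. 06 ~ 2022. 05)
Developed a model for predicting machine-direction coated-layer thickness from meniscus vision data in the Roll-to-Roll slot-die coating process
Demonstration of a machine learning-based intelligent eco-friendly mercerizing system, Korea Evaluation Institute of Industrial Technology (2021. 04 ~ 2022. 05)
Developed a vision data-based monitoring system for the mercerizing process
Development of a smart compact line center for machining parts with complex geometries, Korea Evaluation Institute of Industrial Technology (2021. 01 ~ 2021. 12)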
MRV Lab. (Medical Robotics and Virtual Reality Laboratory)
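Development of a 2-DOF indoor cycling platform and virtual reality-based high-quality cycling simulation technology for immersive indoor cycling, Korea SMEs and Startups Agency (2019. 05 ~ 2020. 05)
Built virtual reality environments based on Unreal Engine
Development of a smart community policing system (Googi), National Research Foundation of Korea (2018. 06 ~ 2019. 11)
Built virtual reality environments based on Unreal Engine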
Multi-Phase Data Configuration Approach for Defect Detection on Roll-to-Roll System Bearings with Massive Data
Author: Yoonjae Lee, Hyogeun Oh, Changwoo Lee
Conference: Korean Society for Precision Engineering (KSPE), Daegu, Korea
Presented: 2022. 10
Optimization Algorithm of Bearing Condition Diagnosis Model Based on Feature Engineering
Author: Hyogeun Oh, Yoonjae Lee, Changwoo Lee
Conference: PRESM, Jeju, Korea
Presented: 2022. 07
Analysis of lateral behavior in drying system for roll–to–roll printed electronics based on computational fluid dynamics
Author: Minho Jo, Hyogeun Oh, Hojin Jeon, Joungbae Choi, Changwoo Lee
Conference: Korean Society of Mechanical Engineers (KSME), Busan, Korea
Presented: 2022. 05
Diagnosis of Roll-to-Roll Printed Electronic System Using a Separability Quantification Algorithm of Density-Based Feature Data
Author: Hyogeun Oh, Changwoo Lee
Conference: Korean Society for Noise and Vibration Engineering (KSNVE), Changwon, Korea
Presented: 2022. 05
Compactness-based Feature Engineering Algorithm for Diagnosing Driven Roll in Roll-to-Roll Continuous Process
Author: Hyogeun Oh, Joungbae Choi, Minjae Kim, Changwoo Lee
Conference: Korean Society for Precision Engineering (KSPE), Jeju, Korea
Presented: 2022. 05
Meniscus image-based thickness prediction of coated layer in Roll-to-Roll slot-die coating processes
Author: Hyogeun Oh, Myeonghwan Yeo, Changwoo Lee
Conference: Korea Flexible & Printed Electronics Society (KFPE), Hoengseong, Korea
Presented: 2021. 12
Optimization of Statistical Feature Variables for Fault Diagnosis on Roll-to-Roll System Spindle Bearings
Author: Yoonjae Lee, Myeonghwan Yeo, Hyogeun Oh, Changwoo Lee
Conference: Korean Society for Precision Engineering (KSPE), Online, Korea
Presented: 2021. 11
Condition Diagnosis of Roll Eccentricity Disturbance in Roll-to-Roll Continuous Systems
Author: Hyogeun Oh, Yoonjae Lee, Byeonghui Park, Changwoo Lee
Conference: Korean Society for Precision Engineering (KSPE), Online, Korea
Presented: 2021. 05
Diagnosis System for Ball Bearing Cage Defects using Fisher Discriminant Ratio
Author: Hyogeun Oh, Yoonjae Lee, Changwoo Lee
Conference: Korean Society for Precision Engineering (KSPE), Online, Korea
Presented: 2020. 09