Sam S. Won

DS/ML Engineer


ABOUT ME


I'm a developer and designer with 8+ years of experience creating data-science-rooted solutions that make an impact. My journey began when I discovered the power of code to transform ideas into reality. With expertise in both front-end and back-end development, I bring a unique perspective to every project, thinking about user experience while ensuring technical excellence. I believe in continuous learning and pushing boundaries, and my approach combines creative problem-solving with technical precision.

As a results-driven Data Science Engineer, I specialize in transforming complex data into actionable insights that drive business growth and innovation. With a strong foundation in machine learning, statistical analysis, and cloud computing, I work at the intersection of data, technology, and business strategy. My expertise lies in designing scalable data pipelines, building predictive models, and using tools like Python, R, SQL, TensorFlow, and AWS to solve real-world problems.

Throughout my career, I've focused on delivering solutions that optimize operations, enhance user experiences, and unlock hidden value in data. Whether I'm developing AI-driven recommendation systems, automating data workflows, or collaborating with cross-functional teams to align technical solutions with business goals, I bring analytical rigor to every project. I stay current with emerging trends in AI, cloud technologies, and data ethics so that my work remains both up to date and responsible.





LATEST DS/ML PROJECTS


A World for Every Child

Aug 2023 - Dec 2023
MVP Delivered

A World for Every Child aims to revolutionize childhood cancer diagnosis in under-resourced areas by harnessing advanced machine learning technologies. Our mission is to improve early and accurate detection of Acute Lymphoblastic Leukemia from blood smears, a critical step towards effective treatment. Despite the challenges of deploying this technology in developing nations, including resource limitations and FDA clearance requirements, our tool offers a quick, cost-effective preliminary diagnosis to aid doctors. By bridging the gap in global childhood cancer outcomes, we aim to introduce machine learning techniques to areas facing significant disparities in medical care, ultimately saving lives.

Project Link

Named Entity Recognition Model

Oct 2022 - Feb 2023
MMF Delivered

Developed a Named Entity Recognition (NER) model using the spaCy Python package to augment feature engineering for identifying customer-relevant data points. The model identified entities such as names, locations, and organizations in unstructured text data, including customer feedback and reviews. By leveraging this NER model, we were able to extract valuable insights into customer behavior, preferences, and demographics, enabling more targeted marketing efforts and improved customer engagement. Evaluated with standard metrics, the model reached 95% precision and 90% recall, strengthening our feature engineering capabilities.
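For readers less familiar with the metrics quoted above, here is a minimal sketch of how entity-level precision and recall are commonly computed. It is not the project's evaluation code: the `(start, end, label)` span format, the `precision_recall` helper, and the sample spans are all illustrative assumptions.

```python
from typing import Set, Tuple

# An entity is modeled as (start_char, end_char, label).
# This span format is an assumption made for illustration.
Entity = Tuple[int, int, str]


def precision_recall(predicted: Set[Entity], gold: Set[Entity]) -> Tuple[float, float]:
    """Exact-match scoring: a prediction counts only if span and label both match."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical annotations for one document (not real project data).
gold = {(0, 8, "PERSON"), (19, 25, "ORG"), (29, 35, "GPE"), (40, 46, "GPE")}
predicted = {(0, 8, "PERSON"), (19, 25, "ORG"), (29, 35, "GPE"), (50, 55, "ORG")}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision is the share of predicted entities that are correct, and recall is the share of true entities that were found, so a model can score well on one and poorly on the other.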

Customer Sentiment Model

Mar 2022 - Jul 2022
MMF Delivered

Developed predictive modeling solutions that integrated multiple NLP techniques, including topic modeling, sentiment analysis, and keyword extraction, and combined these models to build a comprehensive view of customer interactions and preferences. Used a TensorFlow neural network with an LSTM layer to process large volumes of Nexidia- and Genesys-derived transcripts, improving resource allocation by identifying high-priority calls and streamlining agent workflows. The solution also supported personalized responses and proactive issue resolution, which reduced customer wait times and improved overall call center operations.

CURRENT ROLE


  • Amgen Inc.  -  Thousand Oaks, CA

    Development Data & Analytics Manager  |   Jan 2025  -  Present

    As a member of the Operational Design Analytics (ODA) Team within the Clinical Program Operations (CPO) function, I serve as the Operational Design Analytics Manager and help drive the success of clinical trials through data-driven approaches. I work closely with the ODA Senior Team to develop predictive and custom analytics deliverables, and I partner with other cross-functional data teams to drive efficiencies and ensure quality in our data ecosystem. Clear communication and teamwork are central to that cross-functional collaboration. I use historical reference data sets and custom analytics to improve clinical operational planning and execution, and I curate data with an eye for quality so that clinical teams can have confidence in the resulting predictive modeling datasets. I am also responsible for driving innovation and continuous improvement in our data-driven approaches, staying up to date with industry trends and best practices, and sharing that knowledge with the team. Together with strong problem-solving skills and strategic thinking, this work supports successful clinical trials and delivers value to patients.



    • In partnership with the ODA Senior Team, surface feasibility parameters, including historical trial performance metrics (enrollment rate, screen failure rate, drop-out rates, and completion rates) and other related clinical trial metadata.
    • Leverage internal data and predictive analytics tools to inform study planning and placement decisions (examples of data sources/platforms include Trial Trove, Site Trove, Biomed Tracker, Cortellis Clinical Trial Intelligence, Study Optimizer, and Data Query System (DQS)).
    • Collate global site performance data to support recommendations for the most effective geographic footprint based on a study's specific requirements.
    • Surface potential enrollment challenges by quantifying how specific eligibility criteria can impact recruitment rates and trial durations.
    • Harmonize and master the various data sources used by cross-functional teams for study planning and predictive modeling.
    • Drive data quality and process improvement projects to accelerate efficiencies within the ODA Team.
    • Collaborate with cross-functional data colleagues to curate biologically driven training datasets for Machine Learning applications.
    • Utilize Racial and Ethnic Minority (REM) data insights to support the successful enrollment of the Diversity Action Plan (DAP) targets and DAP Mitigation Plans.

LATEST CONSULTING PROJECTS


Point of Sale iOS Application

Aug 2024 - Present
Under development

The Point of Sale (POS) iOS application is a crucial tool for a cafe business, streamlining operations and improving customer satisfaction. Developed using Dart, the app provides an intuitive interface for staff to process orders, track sales, and access business data on the go. The user-friendly design makes it easy to locate menu items, update inventory levels, and manage cash handling. Features like pick-up order management, loyalty programs, and real-time reporting provide valuable insights into cafe performance. With a cloud-based architecture, staff have access to up-to-date information across all devices, enhancing operational efficiency and customer experience.

Thermal Printer API Server

Sept 2024 - Oct 2024
MVP Delivered

The server is built in Python with FastAPI, a high-performance web framework, and is designed to handle e-commerce demands. It integrates with a physical thermal printer to generate receipts for customers. The API endpoints are designed to scale and to handle orders efficiently and accurately. When an order is placed, the server triggers receipt generation, using asyncio to manage concurrent requests, and sends the print data to the thermal printer over the network. This streamlined process keeps order fulfillment fast while the server handles a high volume of requests.
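The order-to-receipt flow described above can be sketched with asyncio alone. This is a minimal illustration under stated assumptions, not the project's code: the FastAPI routes are omitted, and `format_receipt`, `send_to_printer`, `print_worker`, the receipt layout, the placeholder host `192.0.2.10`, and port `9100` are all hypothetical. The demo swaps in an in-memory stand-in for the printer so it runs without hardware.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


def format_receipt(order_id: str, items: Dict[str, float]) -> bytes:
    """Render a plain-text receipt; the layout is illustrative only."""
    lines = [f"Order {order_id}"]
    for name, price in items.items():
        lines.append(f"{name:<20}{price:>8.2f}")
    lines.append(f"{'TOTAL':<20}{sum(items.values()):>8.2f}")
    return ("\n".join(lines) + "\n").encode("ascii")


async def send_to_printer(data: bytes, host: str = "192.0.2.10", port: int = 9100) -> None:
    """Send raw bytes over TCP. Host and port are placeholder assumptions."""
    _reader, writer = await asyncio.open_connection(host, port)
    writer.write(data)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def print_worker(queue: asyncio.Queue, send: Callable[[bytes], Awaitable[None]]) -> None:
    """Drain the queue one job at a time so receipts never interleave."""
    while True:
        data = await queue.get()
        try:
            await send(data)
        finally:
            queue.task_done()


async def demo() -> List[bytes]:
    """Run the worker against an in-memory stand-in for the printer."""
    printed: List[bytes] = []

    async def fake_send(data: bytes) -> None:
        printed.append(data)

    queue: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(print_worker(queue, fake_send))
    await queue.put(format_receipt("A1", {"Latte": 4.5, "Scone": 3.25}))
    await queue.put(format_receipt("A2", {"Mocha": 5.0}))
    await queue.join()  # wait until both receipts have been sent
    worker.cancel()
    await asyncio.gather(worker, return_exceptions=True)
    return printed


receipts = asyncio.run(demo())
print(receipts[0].decode("ascii"))
```

In a setup like this, a request handler would only enqueue the receipt bytes and return, while a single worker serializes access to the printer. Passing `send_to_printer` instead of `fake_send` would target a real device.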

Bubble Tea Shop Website

July 2024 - Sept 2024
MMF Maintenance

The Bubble Tea Shop website is built using Flask, a lightweight Python web framework. The homepage showcases signature drinks with high-quality images, and a search bar allows users to find specific drinks or browse categories. A blog section shares recipes, promotions, and behind-the-scenes stories about the business. Users can easily order online by selecting their preferred drink, size, and flavor, with options for in-store pickup or delivery. The website features a loyalty program and social media feed, keeping customers updated on new menu items and events. Its modern design makes it easy to navigate and find what's needed.

Website Link


SPECIALIZATIONS


DS/ML

  • Python
  • R
  • Rust
  • GoLang
  • JavaScript

ML/DevOps

  • Docker
  • Kubernetes
  • Git
  • Jenkins

Web Dev

  • Python - Flask/Django
  • Dart - Flutter
  • iOS - Swift
  • JS - NodeJS/React

App Dev

  • Dart - Flutter
  • iOS - Swift
  • JS - Electron
  • Python - Flet
  • R - Shiny



EDUCATION



University of California, Berkeley  -  Berkeley, CA
Master of Information and Data Science
Dec 2023

Received a rigorous, interdisciplinary education focused on data science, machine learning, and information systems. The program emphasized hands-on learning through collaborative projects, real-world datasets, and partnerships with industry leaders. The coursework combined technical skills in programming, statistics, and AI with ethical considerations and societal impact, preparing me to tackle complex data challenges. The program's emphasis on innovation and critical thinking, coupled with Berkeley's research-driven environment, equipped me with the tools to excel in data-driven roles across tech, healthcare, and beyond.

  • Machine Learning Systems Engineering
  • Machine Learning at Scale
  • Applied Machine Learning
  • Experiments and Causal Inference
  • Research Design and Applications for Data and Analysis

Capstone Project: A World for Every Child
The project aims to address global disparities in childhood cancer care by bringing machine learning applications to regions with limited medical infrastructure, ultimately bridging gaps in treatment outcomes. It uses machine learning on blood smear images to improve early and accurate diagnosis of Acute Lymphoblastic Leukemia (ALL) in under-resourced areas.



Montclair State University  -  Montclair, NJ
Bachelor of Science, Mathematics
June 2015

Developed strong analytical and problem-solving skills through rigorous coursework in advanced calculus, linear algebra, statistics, and computational methods. The program emphasized both theoretical foundations and practical applications, preparing me to tackle complex mathematical problems in fields like data science, finance, and engineering.

  • Operations Research
  • Advanced Calculus
  • Statistics
  • Probability
  • Linear Algebra
  • Number Theory

  • Object Oriented Programming
    • Python
    • R
    • Java
  • Data Structures & Algorithms
  • Query Languages
    • SQL
    • NoSQL
    • GraphQL