Dr. Jianping Zhang is a Senior Managing Director at Ankura in the Washington, DC office. He has over 25 years of proven leadership and technical experience in solving complex data analytics problems. Specifically, his experience covers artificial intelligence, machine learning, predictive analytics, big data analytics, text analytics, and software development. Jianping draws on this experience to address clients’ complex and evolving data challenges. He has led efforts to develop AI, machine learning, and advanced analytics products and solutions for government and commercial problems in various industries, including financial, legal, and high tech. Jianping’s innovative research and development work has resulted in more than 100 peer-reviewed technical publications and two US patents.
Jianping’s professional experience includes:
- Text analytics platform for legal industry: Leads the efforts in developing a text analytics platform for applications in the legal industry. The platform includes tools for text categorization (predictive coding), document clustering, near-duplicate document detection, email threading, and paragraph search.
- Healthcare payment denial action recommendation tool: Developed a tool that applies machine learning techniques to build predictive models for recommending the best action on healthcare payment denials. The tool is being integrated into the medical payment management process.
- Detection of anomalous financial messages: Designed and developed an anomaly detection product for detecting eight types of anomalous financial messages.
- Tool for detecting non-compliant online payments: Led efforts in developing an innovative entity matching algorithm for identifying non-compliant online payments.
- Detection of suspicious hotel reward point stays: Applied a predictive analytics technique and business rule method to build predictive models to identify suspicious hotel reward point stays.
- Sentiment analysis of social media and survey data for a bank: Applied sentiment analysis techniques to extract sentiments or opinions from social media posts and bank survey data about a list of financial products and services offered by a US bank.
- Development of methodology for vendor risk scoring: Led the technical efforts in the design and development of a vendor risk scoring system that assigns a risk score to each vendor according to a set of risk indicators derived from invoices, payments, vendors, and relationships between vendors.
- Predictive analytics for identifying fraudulent invoices: Applied a machine learning algorithm to learn priority rules for prioritizing investigations of suspicious invoices in a client project. These rules helped auditors prioritize which invoices to look at more closely and increase efficiency by focusing auditors on highly suspicious invoices.
- Rule induction system with scoring capability: Developed a rule induction (rule learning) system. This tool learns IF-THEN rules from a set of training data. An innovative rule-based scoring algorithm was developed. It has been applied to several predictive analytics applications.
- Detection of suspicious financial statements of public companies: As a technical lead, developed an innovative rule induction and scoring method and applied the method to build predictive models for identifying publicly held companies that may have failed to comply with accounting principles or violated securities laws.
- Tiger team member for solving problems of a fraud detection system: Assisted in solving emergent problems of the Internal Revenue Service’s electronic fraud detection system.
- Assessment of operational risks of hedge funds and mutual funds: Built logistic regression models to assign risk scores to investment advisers of mutual funds. Applied statistical techniques such as regression and canonical correlation to assess operational risks of hedge funds using data on investment advisers and on hedge fund performance and characteristics.
- Web filtering tool: Led the development of a web filtering tool using machine learning technologies. This tool is able to automatically identify websites with 14 types of objectionable content in four different languages.
- Web content categorization service: Led the effort in developing a web service for content categorization. Classifiers have been developed for a hierarchy of more than two dozen topic categories. A prototype with classifiers for more than sixty topic categories was also developed.
- Rule induction system for data with highly imbalanced class distribution: Developed a system that learns rules from datasets with highly imbalanced class distributions. It has been used by a government agency to detect fraud for more than five years.
- Discovery of patterns of suspicious vehicle border crossings: This project involved discovering common patterns of suspicious border crossings and accounted for the “needle in the haystack” distribution, in which fewer than 0.1% of crossings were suspicious. A novel scalable rule discovery algorithm (RLSD) for data with highly skewed distributions was developed.
News & events
- LegalTech, 2018, “Empirical Evaluations of Active Learning Strategies in Legal Document Review”
- IEEE International Conference on Big Data, “Empirical Evaluations of Active Learning Strategies in Legal Document Review,” Boston, MA
- LegalTech, 2017, “Predictive Coding: Deconstructing the Secret Sauce”
- Caribbean Regional Compliance Association Conference, 2017, “The Future Is Here: AI and Compliance, 2017”
- IEEE International Conference on Big Data, “Empirical Evaluation of Preprocessing Parameters’ Impact on Predictive Coding’s Effectiveness,” Washington D.C.
Insights & innovation
- “Using Machine Learning on Legal Matters: Paying Attention to the Settings Behind the Curtain,” Hastings Science and Technology Law Journal, Volume 10, Issue 1, 10/2017, with Keeling, R., Huber-Fliflet, N., Zhang, J., and Chhatwal, R.
- “Empirical Evaluations of Active Learning Strategies in Legal Document Review,” In Proceedings of IEEE International Conference on Big Data, 2017, with Huber-Fliflet, N., Zhang, J., Zhao, H., Keeling, R., and Chhatwal, R.
- “Empirical Evaluation of Preprocessing Parameters’ Impact on Predictive Coding’s Effectiveness,” In Proceedings of IEEE International Conference on Big Data, 2016, with Huber-Fliflet, N., Zhang, J., Zhao, H., Keeling, R., and Chhatwal, R.
- “An Empirical Analysis of the Training and Feature Set Size in Text Categorization for e-Discovery,” In Proceedings of ICAIL 2013 Workshop on Standards for Using Predictive Coding, Machine Learning, and Other Advanced Search and Review Methods in E-Discovery, 2013, with Hadjarian, A., Zhang, J., and Cheng, S.