Ph. D. Associate Professor
State Key Laboratory of Novel Software Technology
Department of Computer Science and Technology, Nanjing University
Currently, I am an associate professor in the Department of Computer Science and Technology of Nanjing University, and a member of the State Key Laboratory of Novel Software Technology. My research is supported by projects from the National Natural Science Foundation of China (NSFC) and the Jiangsu Provincial Research Foundation for Basic Research.
I received my B.Sc. degree and Ph. D. in Computer Science from Nanjing University in Jun. 2006 and Jun. 2012, respectively. I have been a member of the NLP Group, led by Prof. Jiajun Chen, since Sep. 2005, when I was an undergraduate. During my Ph. D. study, I spent 11 months (from Oct. 2007 to Aug. 2008) as a visiting student in the NLC group at MSRA. I worked with Long Jiang on the Chinese Couplet project for the first six months, and then switched to the SMT team, working with Mu Li, Henry Li and Dongdong Zhang. I also spent 12 months as a visiting student in the InterACT lab, LTI, CMU, working with Prof. Stephan Vogel.
My research interests lie in natural language processing (NLP), one of the hottest and most fundamental challenges in artificial intelligence, which aims to automatically understand and generate natural language texts. My group and I work on problems such as lexical/syntactic/semantic analysis of natural language texts, machine translation, question answering, information extraction, etc. These problems are sometimes referred to as structured prediction problems, because they generate complex outputs such as syntactic trees or translations in another language. We are particularly interested in designing and applying statistical methods/models (e.g. deep learning with recurrent/convolutional neural networks) for these problems. Currently, I mainly focus on the following topics.
- Structured Prediction in Machine Translation. We're investigating various methods to improve the learning process of machine translation, by designing structured objectives, exploring more possible translation candidates, incorporating syntactic information, bringing humans into the learning loop, etc.
- Sequence-to-sequence Learning for Structured Prediction in NLP. We're currently working on improving the end-to-end learning framework for better structured predictions, including designing new neural architectures, constraining the learning process by multitask learning, employing reinforcement learning to take undecided actions, etc.
- Distributed Representation and Learning for NLP. We're interested in learning the representations of grammatical units such as words, phrases, sentences, documents, etc. Special attention has been paid to learning the representations of less frequently occurring units.
- Question Answering. We're building systems that can automatically answer questions with the help of a knowledge database of entities and relations. Challenges include mention detection, entity linking, relation detection, etc.
in the Spring semester, for undergraduate students.
in the Autumn semester, for graduate students.
Rongxiang Weng, Shujian Huang*, Zaixiang Zheng, Xinyu Dai and Jiajun Chen. Neural Machine Translation with Word Predictions. accepted by EMNLP'2017.
Huadong Chen, Shujian Huang*, David Chiang, Xin-Yu Dai and Jiajun Chen. Top-rank Enhanced Listwise Optimization for Statistical Machine Translation. accepted by CoNLL2017.
Jianbing Zhang, Yixin Sun, Shujian Huang*, Cam-Tu Nguyen, Xiaoliang Wang, Xinyu Dai, Jiajun Chen, Yang Yu. AGRA: An Analysis-Generation-Ranking Framework for Automatic Abbreviation from Paper Titles. accepted by IJCAI2017.
Huadong Chen, Shujian Huang*, David Chiang, Jiajun Chen. Improved Neural Machine Translation with a Syntax-Aware Encoder and Decoder. accepted by ACL2017.
Hao Zhou, Zhaopeng Tu, Shujian Huang, Xiaohua Liu, Hang Li and Jiajun Chen. Chunk-based Bi-Scale Decoder for Neural Machine Translation. accepted by ACL2017 (Short paper).
Hao Zhou, Yue Zhang, Chuan Chen, Shujian Huang, Xin-Yu Dai, and Jiajun Chen. A Neural Probabilistic Structured-Prediction Method for Transition-Based Natural Language Processing. Journal of Artificial Intelligence Research (JAIR), Volume 58, pages 703-729.
Hao Zhou, Yue Zhang, Shujian Huang, Junsheng Zhou, Xin-Yu Dai and Jiajun Chen. A Search-Based Dynamic Reranking Model for Dependency Parsing. in Proceedings of 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016), pages 1393-1402, Berlin, Germany, August 7-12, 2016.
Shujian Huang, Huifeng Sun, Chengqi Zhao, Jinsong Su, Xinyu Dai and Jiajun Chen. Tree-state based Rule Selection Models for Hierarchical Phrase-based Machine Translation. in Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI 2016), pages 2817-2823, New York, USA, July 9-15, 2016.
Shanbo Cheng, Shujian Huang, Huadong Chen, Xinyu Dai and Jiajun Chen. PRIMT: A Pick-Revise Framework for Interactive Machine Translation. in Proceedings of NAACL-HLT 2016, pages 1240-1249, San Diego, California, June 12-17, 2016.
Hao Zhou, Yue Zhang, Shujian Huang, Xin-Yu Dai, and Jiajun Chen. Evaluating a Deterministic Shift-Reduce Neural Parser for Constituent Parsing. in Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), pages 659-663, Portorož, Slovenia, May 23-28, 2016.
Yinggong Zhao, Shujian Huang*, Xinyu Dai, and Jiajun Chen. Adaptation of Language Models for SMT Using Neural Networks with Topic Information. ACM Transactions on Asian and Low-Resource Language Information Processing (ACM TALLIP), 2016, 15(3): 19:1-19:15.
Hao Zhou, Shujian Huang*, Junsheng Zhou, Yue Zhang, Huadong Chen, Xinyu Dai, Chuan Cheng, Jiajun Chen. Enhancing Shift-Reduce Constituent Parsing with Action N-Gram Model. ACM Transactions on Asian and Low-Resource Language Information Processing (ACM TALLIP), 2016, 15(3): 13:1-13:17.
Yichu Zhou, Shujian Huang, Xinyu Dai, Jiajun Chen. Resolving Coordinate Structures for Chinese Constituent Parsing. in Natural Language Processing and Chinese Computing, J. Li et al. (Eds.): NLPCC 2015, LNAI 9362, pp. 353-361, Springer International Publishing, 2015.
Liqiang Niu, Xin-Yu Dai, Shujian Huang, and Jiajun Chen. A Unified Framework for Jointly Learning Distributed Representations of Word and Attributes. in Proceedings of 7th Asian Conference on Machine Learning (ACML 2015), November 20-22, 2015, Hong Kong, JMLR: Workshop and Conference Proceedings 45:143-156, 2015.
Jinsong Su, Deyi Xiong, Shujian Huang, Xianpei Han, Junfeng Yao. Graph-Based Collective Lexical Selection for Statistical Machine Translation. in Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP2015), pages 1238-1247, Lisbon, Portugal, 17-21 September 2015.
Shujian Huang, Huadong Chen, Xinyu Dai, Jiajun Chen. Non-linear Learning for Statistical Machine Translation. in Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL2015), pages 825-835, Beijing, China, July 26-31, 2015.
Hao Zhou, Yue Zhang, Shujian Huang, Jiajun Chen. A Neural Probabilistic Structured-Prediction Model for Transition-Based Dependency Parsing. in Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL2015), pages 1213-1222, Beijing, China, July 26-31, 2015.
G. Hu, X. Dai, S. Huang, J. Chen. A Synthetic Approach for Recommendation: Combining Ratings, Social Relations, and Reviews. in Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015), pages 1756-1762.
Xin-Yu Dai, Chuan Cheng, Shujian Huang, and Jiajun Chen. Sentiment Classification with Graph Sparsity Regularization. Computational Linguistics and Intelligent Text Processing. Springer International Publishing, 2015. 140-151
X. Dai, J. Zhang, S. Huang, J. Chen, and Z. Zhou. Structured sparsity with group-graph regularization. In: Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI'15), Austin, TX, 2015
Yinggong Zhao, Shujian Huang, Xinyu Dai, Jianbing Zhang, Jiajun Chen. Learning Word Embeddings from Dependency Relations. International Conference on Asian Language Processing 2014 (IALP 2014), October 20-23, 2014, Sarawak, Malaysia
Yinggong Zhao, Shujian Huang, Huadong Chen, and Jiajun Chen. An Investigation on Statistical Machine Translation with Neural Language Models. CCL and NLP-NABD 2014, pp. 175-186, October 18-19, 2014, Wuhan, China
N. Xi, X. Dai, S. Huang, and J. Chen. Discriminative Word Alignment over Multiple Word Segmentations. Chinese Journal of Electronics, 2014 Vol. 23 (CJE-2): 263-270
Shujian Huang, Xinyu Dai, Jiajun Chen. Hypothesis Pruning in Learning Word Alignment. Chinese Journal of Electronics, 2013 Vol. 22 (CJE-1).
Qiang Fu, Xinyu Dai, Shujian Huang, and Jiajun Chen. Forgetting word segmentation in Chinese text classification with L1-regularized logistic regression. The 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Part II, LNAI 7819, pp. 245-255. Gold Coast, Australia, 14-17 Apr. 2013.
Ning Xi, Guangchao Tang, Xinyu Dai, Shujian Huang, Jiajun Chen. Enhancing Statistical Machine Translation with Character Alignment. in Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (short paper), Jeju Island, Korea, July 8 - 14, 2012.
Shujian Huang, Stephan Vogel and Jiajun Chen. Dealing with Spurious Ambiguity in Learning ITG-based Word Alignment. in ACL:HLT 2011:shortpaper, Portland, Oregon, USA, June 19-24, 2011.
Qiufeng Wu, Shujian Huang, Xinyu Dai and Jiajun Chen. A Syntax-based Pre-reordering Method for Chinese-English Machine Translation. in Proceedings of the 12th Chinese National Conference on Computational Linguistics (CNCCL-2011), Luoyang, China, June 2010. (In Chinese)
Shujian Huang, Kangxi Li, Xinyu Dai and Jiajun Chen. Improving Word Alignment by Semi-supervised Ensemble. in CoNLL 2010. Uppsala, Sweden, July 11-16, 2010.
Yabing Zhang, Junsheng Zhou, Shujian Huang and Jiajun Chen. Combining ILP and MLN for Coreference Resolution. in International Conference on Asian Language Processing (IALP 2009), Singapore, Dec 7-9, 2009
Biping Meng, Shujian Huang, Xinyu Dai and Jiajun Chen. Segmenting Long Sentence Pairs for Statistical Machine Translation. in International Conference on Asian Language Processing (IALP 2009), Singapore, Dec 7-9, 2009
Weipeng Liu, Junsheng Zhou, Shujian Huang and Jiajun Chen. Global Optimization Based On Clustering for Coreference Resolution. in The 10th Chinese National Conference on Computational Linguistics (CNCCL-2009), Yantai, China, July 24-26, 2009. (In Chinese)
Shujian Huang, Ning Xi, Yinggong Zhao, Xinyu Dai, Jiajun Chen. An Error-Sensitive Metric for Word Alignment in Phrase-based SMT. Journal of Chinese Information Processing, 2009, vol. 23, no. 3. (Revised version of CWMT2008 paper, In Chinese)
Shujian Huang, Yabing Zhang, Junsheng Zhou, Jiajun Chen. Coreference Resolution using Markov Logic Networks. in The 10th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing'2009), Mexico City, Mexico, 2009 (poster; Best Poster Award, 1/25). in Research in Computing Science: Advances in Computational Linguistics, Alexander Gelbukh (Ed.), vol. 41, pages 157-168, ISSN: 1870-4069.
Junsheng Zhou, Shujian Huang, Jiajun Chen and Weiguang Qu. A New Graph Clustering Algorithm for Chinese Noun Phrase Coreference Resolution. Journal of Chinese Information Processing, 2007, vol. 21, no. 2. (In Chinese)
Shujian Huang, Yinggong Zhao, Boyuan Li, Qiufeng Wu, Xinyu Dai, Jiajun Chen. Nanjing University's System Report for NIST MT09 Workshop. Included in the materials of NIST Open MT 2009, Ottawa, ON, Canada. Aug 31-Sep 1, 2009.
Shujian Huang, Yinggong Zhao, Boyuan Li, Qiufeng Wu, Xinyu Dai, Jiajun Chen. NJU-NLP's Technique Report for the 5th China Workshop on Machine Translation. Included in the materials of CWMT 2009, Nanjing, China. Oct. 15-16, 2009. (In Chinese)
Last Update 2017-07-26