Scientific statement by Yoram Baram


All references correspond to my journal publication list, which also gives the names of my much-appreciated coauthors and collaborators.

Over the years, my scientific interest in dynamical systems has taken me from issues of identification, estimation and control to issues of adaptation, association and learning, providing me with a better understanding of the dynamical nature of memory, learning and behavior. My work on the role played by visual feedback in movement ([35], [40], [41], [44], [49]) and, more generally, on artificial neural networks ([28]-[54]) has led me to the development of a closed-loop augmented reality apparatus for aiding people with movement disorders (US patent 6734834, May 11, 2004). The device has been clinically tested on patients with neurological disorders, such as Parkinson’s Disease (PD), Multiple Sclerosis (MS), Cerebral Palsy (CP), brain stroke (CVA) and Senile Gait (SG, or lower-body Parkinsonism), and found to significantly improve their walking abilities, without the side effects caused by medication and brain surgery. The improvement was particularly dramatic in patients with high disease severity, who are frozen in place unless highly medicated but walked almost normally while using the device [55]. Video clips of patients’ performance when using the device may be viewed at . This work was the first of only three noted for their significance by the Evaluation Committee appointed by the Technion President to assess the international standing and the research and study programs of the Faculty of Computer Science (R. Karp, R. E. Bryant and A. Pnueli, “The Technion Faculty of Computer Science,” Report of the Review Committee, submitted to the President of the Technion on May 7, 2000). Following my lectures at the Annual Meeting of the Israeli Neurological Society, Zichron Yaakov, 2000, and at the 53rd Annual Meeting of the American Academy of Neurology, Philadelphia, 2001, a meeting attended by 12,000 neurologists from around the world, news articles on this work appeared during 2001-2002 in many Israeli and foreign newspapers and on radio and television programs.
I received the Research Award for the Best Platform Presentation in Research in Multiple Sclerosis at the 19th Annual Meeting of the Consortium of Multiple Sclerosis Centers, Orlando, Florida, June 2005, where I presented my work on improving gait in MS patients by virtual reality cues [58] (there were 140 platform and poster presentations at this meeting, one of the leading meetings in the field). This work appeared as a full research article and a highlight in the January 28, 2006 issue of the highly acclaimed medical journal Neurology.

In addition to the Visual Walker, I have recently completed the development of the Auditory Walker, which provides a complementary or alternative channel for visually impaired people. Like the Visual Walker, it provides the patient with a sensory feedback signal that helps regulate walking patterns. Clinical tests have shown significant improvement in the walking abilities of PD and MS patients using the device [65]. Studies of the effects of the Visual and Auditory Walkers on patients with movement disorders are presently being conducted at several medical centers in Israel and North America. In August 2005 I received an Intel grant for this work.

A natural evolution of my earlier work on adaptive systems has led me to artificial neural networks, associative memories, machine learning and man-machine interfaces. The global minima of partially connected binary neural networks were characterized in [31], which was an invited paper in the first Special Issue on Neural Networks of the Proceedings of the IEEE. My studies on the information capacities of neural networks ([28], [31]-[34], [36], [38], [39]) have shown the benefits of sparse encoding for information storage and retrieval. Second-order bounds on the basins of attraction and the convergence times of nonlinear dynamical systems and neural networks ([37], [45], [48]) constitute performance bounds for associative memories that are considerably tighter than previously known first-order bounds. I proposed new feedforward network architectures and learning algorithms, notably balancing [42] and random embedding [52], which yield highly efficient pattern recognition mechanisms. The maximum-entropy density estimation method [43], cited in the latest edition of the classical book by R.O. Duda and P.E. Hart, “Pattern Classification”, has become the basis for several blind source separation and deconvolution methods. The manifold stochastic dynamics method for Bayesian learning [53] was one of only twenty-five works selected for full oral presentation out of over 600 papers submitted to NIPS 1999 (Neural Information Processing Systems, the most prestigious and selective conference in the field). My recent work, introducing kernel polarization for learning [57], provides a low-complexity solution to the long-standing problem of optimizing the kernel parameters, which was previously done mostly by exhaustive search. In a subsequent paper [66], I proposed unmodulated learning, which, in conjunction with local kernel polarization, performs comparably to the best modulated learning machines (e.g., SVM) at a small fraction of the cost.
These concepts are presently being extended to regression (process prediction) by a PhD student. The anticipation of head motions from EMG signals for virtual reality applications is proposed and analyzed in [59].

Much of my earlier work on stochastic dynamical systems was concerned with minimal realization and order reduction of stochastic systems, estimators and controllers. I proposed an information-theoretic criterion for parameter set reduction [1], leading to new methods for model reduction and for fixed-gain estimator and controller design ([10], [19]). The significance of this approach is that adaptive controllers and estimators employing the information criterion remain stable under large parameter variation and model order reduction. This has direct implications for such problems as aircraft control within a large flight envelope. This work is cited in several fundamental textbooks, notably P. E. Caines, “Linear Stochastic Systems” and P. S. Maybeck, “Stochastic Models, Estimation and Control”. In his plenary paper “Issues on Robust Adaptive Control”, presented at the 16th World Congress of the International Federation of Automatic Control (IFAC), Prague, July 2005, Professor Emeritus Michael Athans of MIT, a legend of control theory (author of the classical textbook “Optimal Control”), cited this work 37 times, devoting to it a subsection and one of the two appendices of his paper, both titled “The Baram Proximity Measure”. A few years later I found the dual algebraic properties of linear systems, termed “estimability” and “regulability”, that guarantee strict error reduction in state estimation and strict cost reduction in feedback control, respectively ([21], [25], [26]). Until the introduction of these properties, which guarantee order minimality of the optimal state estimator and regulator, respectively, it was wrongly believed that such order minimality is guaranteed by the previously proposed observability and controllability conditions (Kalman, 1960).

A chronological account of my other scientific contributions follows:

The consistency of system identification methods when the true parameter is not a member of the search set, a property with direct implications for the stability of adaptive controllers and state estimators under large parameter variation, was proved in [1]. An information criterion for the design and analysis of such systems was developed in [10], [19]. Statistical tests for linear stochastic model validation were proposed in [4]. Stochastic model realization and reduction methods were presented in [11], [15], [17], [20], [21], [23]. The structural properties of linear dynamical systems that allow strict error reduction in state estimation and strict cost reduction in feedback control were derived in [26]. The space of all linear predictors of minimal order for a stationary process was specified in terms of linear transformations on the associated Hankel matrix [29]. Closed forms for the Levinson coefficients of polynomial compensators and inverse systems were derived in [30]. Optimal base placement for function approximation by radial basis functions and neural networks was first proposed in [14] and [18], replacing previously used grid methods. Information storage in fractal neural networks was first proposed in [28]. The global minima of partially connected binary neural networks were characterized in [31]. Second-order bounds on the domain of attraction of a stable equilibrium point of nonlinear dynamical systems and neural networks, and on the rate of attraction within the estimated domain, were derived in [37] and improve on previously known first-order bounds. Higher flexibility in the construction of such bounds was proposed in [45], where nested Lyapunov functions were used to define the domain of attraction. A class of feedback control functions that improve the convergence rates of nonlinear dynamical systems and neural networks was presented in [48] and applied to the solution of system design and combinatorial optimization problems.
Bounds on the information capacities of neural networks storing sparse binary random vectors, presented in [32] and [34], generalize and extend previously known bounds for non-sparse vectors. Sparsely encoded networks that can perform multiple memory tasks, such as pattern correction, association and sequence regeneration, were proposed in [36] and [38], and their information capacities were derived. The interaction between motion and vision was studied in [35], [40], [41], [44]. The time to collision between an object and a moving observer was shown in [35] to be inversely proportional to the line integral of the velocity normal to the edge of the object’s projection on the image plane, and an equivalent computation was shown to be performed by a diffusion mechanism, implemented by a locally connected linear network. Obstacle detection by a neural network from expansion patterns in the optical flow was proposed in [40] and [44]. The motion trajectory that guarantees a contrast between the expanding optical image of a three-dimensional object and the stationary image of its background was derived in [41]. It was shown in [43] that the probability density function of a random vector can be learned by maximizing the output entropy of a feedforward network of sigmoidal units. Parameter optimization algorithms were developed along with convergence results. The intersection surface of the estimated densities was derived in [46] and used in solving classification problems. Another approach to the classification problem, employing sparse encoding and linear separation in high-dimensional binary space, was proposed in [42]. A geometric approach to the analysis of classifier complexity and its relationship to classification accuracy was presented in [51]. The statistical benefit of deferred decision in classification was analyzed in [47].
The probabilistic complexity of random embedding was derived in [52], resolving one of the oldest and most fundamental problems in pattern recognition and neural networks, originally posed by Rosenblatt: how many randomly selected parameters are needed in order to solve a classification problem? Dynamical methods for Bayesian density sampling on function manifolds, improving on Monte Carlo methods, were proposed in [53]. The active learning method, a powerful new approach to pattern recognition, was enhanced in [56] by allowing on-line transition within a group of classifiers so as to maximize performance.