In August this year, Elsevier, one of the world’s major providers of scientific, technical, and medical information, announced its new AI Resource Center. The idea behind it is to offer free access to published research and other resources in the area of Artificial Intelligence.
Through its Artificial Intelligence Program, Elsevier wants to conduct a comprehensive global examination of artificial intelligence by combining semantic research with insights from AI experts and practitioners.
Today, Elsevier released a comprehensive report clarifying the scope and activity within the large field of artificial intelligence (AI). This work, part of Elsevier’s Artificial Intelligence Program, is expected to help research leaders, policymakers, funders, and investors navigate and understand AI. The comprehensive, 80-page report contains useful insights into AI research, education, ethics, and trends.
According to Elsevier: “The report holds a special focus on delineating the field of AI bottom up by using AI technologies.” It is available free of charge on Elsevier’s AI Resource Center.
The report is divided into 5 chapters: “Identifying Artificial Intelligence research”, “Artificial Intelligence: a multifaceted field”, “Artificial Intelligence research growth and regional trends”, “Artificial Intelligence education” and “The imperative role of ethics in Artificial Intelligence”.
In general, it tries to answer questions such as where AI is now, where it is headed, and what the global trends in this research area are. It identifies seven AI research clusters and provides insights on growth and regional trends in AI research.
As part of Elsevier’s AI Resource Center, this report is expected to help a large number of people across different disciplines. According to Elsevier, the report is just a contribution to the wider dialogue around AI and cannot cover all aspects of the discussion. In any case, it represents a significant, openly available contribution to the field of AI.
In the near future, a trip to the doctor’s in China might mean going to an AI-powered booth clinic. Ping An Good Doctor, China’s biggest online health care platform, plans to build ‘hundreds of thousands’ of its telephone booth-sized, AI-powered, unstaffed clinics all across China.
The company’s CEO announced big plans for revolutionizing healthcare in China. The idea is to gather a patient’s history and symptom data and provide a preliminary diagnosis plan to a specialist consultant doctor. The small AI-powered clinic will, in fact, connect patients with the company’s in-house medical team in a more convenient way: text and voice interaction between the patient and the AI agent will provide health-related data, a specialist will then give a diagnosis, and patients can buy their medicine from the smart drug-vending machine inside the clinic.
“We plan to build hundreds of thousands of these unstaffed clinics across the country in three years” – says Wang Tao, CEO and founder of Ping An Good Doctor.
A pilot program has already been run in part of Shanghai. The company has big ambitions regarding the use of AI in healthcare. Previously, Ping An Good Doctor launched a smartphone app called “Good Doctor”, which provides diagnosis, treatment, and online appointment booking. Together with the online health services platform, the company also runs an online medicine store. According to the company, it is oriented towards a “closed loop health care ecosystem” that should bring more convenient medical consultations.
China has been facing a shortage of doctors. Technology and Artificial Intelligence offer one way to bridge that gap, and Ping An Good Doctor’s unstaffed booth clinics might play a key role in filling it and improving health services across the country.
Waymo – a former Google project and a pioneer in autonomous vehicles – has officially launched the first commercial driverless taxi service. John Krafcik, Waymo’s Chief Executive Officer, has announced Waymo One – the company’s first commercial self-driving service.
Back in 2009, Google started the development of its self-driving technology at the company’s secretive X lab run by co-founder Sergey Brin. The project, as well as the self-driving unit at Google, were evolving until December 2016 when the unit was renamed Waymo and made into its own separate division in Alphabet.
In the years that followed, as a separate company, Waymo made a number of partnerships, including with Fiat Chrysler Automobiles, Lyft, AutoNation, Intel, and Jaguar Land Rover. Its technology was mature enough that, in October 2018, it became the first company to receive a permit to operate fully driverless cars.
Today, Waymo announced that it is taking a huge step forward. Its self-driving vehicles will be available on call as Waymo’s first commercial service. According to Waymo, the cars will be available across several cities in the Metro Phoenix area, including Chandler, Tempe, Mesa, and Gilbert. Early riders – participants in the company’s research program – will be the first to access the service.
“We’ll first offer Waymo One to hundreds of early riders who have already been using our technology. Over time, we hope to make Waymo One available to even more members of the public as we add vehicles and drive in more places,” says John Krafcik, Waymo’s CEO.
Customers will get a clean car and an assigned Waymo driver (backed by over 10 million miles of experience on public roads). Waymo says it expects rider feedback, which will be vital for future development. Alongside the self-driving service, the company provides a Waymo app to help and support riders during their trips. You can read what it is like to ride in Waymo One in Waymo’s blog post.
In conclusion, this is a big step for Waymo and for autonomous driving in general.
GANs can be taught to create (or generate) worlds similar to our own in any domain: images, music, speech, etc. Since 2014, a large number of improvements to GANs have been proposed, and GANs have achieved impressive results. Researchers from the MIT-IBM Watson AI Lab have presented GAN Paint, based on GAN Dissection – a method to determine whether an explicit representation of an object is present in a feature map from a hidden layer.
However, a question raised very often in ML concerns the lack of understanding of the methods being developed and applied. Despite the success of GANs, their visualization and understanding remain little-explored areas of research.
A group of researchers led by David Bau have done the first systematic study for understanding the internal representations of GANs. In their paper, they present an analytic framework to visualize and understand GANs at the unit-, object-, and scene-level.
Their work resulted in a general method for visualizing and understanding GANs at different levels of abstraction, several practical applications enabled by their analytic framework, and open-source interpretation tools for better understanding Generative Adversarial Network models.
From what we have seen so far, especially in the image domain, Generative Adversarial Networks can generate highly realistic images across different domains. From this perspective, one might say that GANs have learned facts at a higher level of abstraction – objects, for example. However, there are cases where GANs fail terribly and produce very unrealistic images. So, is there a way to explain at least these two cases? David Bau and his team tried to answer this question, among a few others, in their paper. They studied the internal representations of GANs and tried to understand how a GAN represents structures and relationships between objects (from the point of view of a human observer).
As the researchers mention in their paper, there has been previous work on visualizing and understanding deep neural networks but mostly for image classification tasks. Much less work has been done in visualization and understanding of generative models.
The main goal of the systematic analysis is to understand how objects such as trees are encoded by the internal representations of a GAN generator network. To do this, the researchers study the structure of a hidden representation given as a feature map. Their study is divided into two phases that they call: dissection and intervention.
Characterizing units by Dissection
The goal of the first phase, dissection, is to determine whether an explicit representation of an object is present in a feature map from a hidden layer and, moreover, to identify which classes from a dictionary of classes have such an explicit representation.
To search for explicit representations of objects, they quantify the spatial agreement between a unit’s thresholded feature map and a concept’s segmentation mask using the intersection-over-union (IoU) measure. The result is called agreement, and it allows individual units to be characterized: the concepts related to each unit can be ranked, and each unit can be labeled with the concept that matches it best.
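The agreement measure is straightforward to express in code. The sketch below is a simplified illustration, not the authors’ implementation: it computes the IoU between a single unit’s thresholded feature map (assumed already upsampled to image resolution) and a concept’s binary segmentation mask.

```python
import numpy as np

def unit_class_agreement(feature_map, seg_mask, threshold):
    """IoU between a unit's thresholded activations and a concept mask.

    feature_map : 2-D array of one unit's activations
    seg_mask    : 2-D boolean array, True where the concept is present
    threshold   : activation level above which the unit counts as "on"
    """
    unit_on = feature_map > threshold              # binarize the unit
    intersection = np.logical_and(unit_on, seg_mask).sum()
    union = np.logical_or(unit_on, seg_mask).sum()
    return intersection / union if union > 0 else 0.0

# Toy example: a 4x4 activation map vs. a "tree" mask in the top-left block.
activations = np.array([[0.9, 0.8, 0.1, 0.0],
                        [0.7, 0.6, 0.0, 0.1],
                        [0.0, 0.1, 0.0, 0.0],
                        [0.1, 0.0, 0.0, 0.0]])
tree_mask = np.zeros((4, 4), dtype=bool)
tree_mask[:2, :2] = True

iou = unit_class_agreement(activations, tree_mask, threshold=0.5)
print(iou)  # 1.0: the thresholded unit matches the mask exactly
```

Ranking every concept in the dictionary by this score, and labeling the unit with the best-matching concept, is then a simple argmax over concepts.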
Measuring causal relationships using Intervention
The second important question mentioned before is causality. Intervention, the second phase, seeks to estimate the causal effect of a set of units on a particular concept.
To measure this effect, the intervention phase measures the impact of forcing units on (unit insertion) and off (unit ablation), again using segmentation masks. More precisely, a set of units in a feature map is forced on and off, the two resulting generated images are segmented, and the two segmentation masks are compared to measure the causal effect.
For the whole study, the researchers use three variants of Progressive GANs (Karras et al., 2018) trained on LSUN scene datasets. For the segmentation task, they use a recent image segmentation model (Xiao et al., 2018) trained on the ADE20K scene dataset.
An extensive analysis was done using the proposed framework for understanding and visualization of GANs. The first part – Dissection was used by the researchers for analyzing and comparing units across datasets, layers, and models, and locating artifact units.
A set of dominant object classes and the second part of the framework – intervention – were used to locate causal units that can remove and insert objects in different images. The results are presented in the paper and the supplementary material, and a video demonstrating the interactive tool was released. Some of the results are shown in the figures below.
This is one of the first extensive studies targeting the understanding and visualization of generative models. Focusing on the most popular generative model – Generative Adversarial Networks – this work reveals significant insights about generative models. One of the main findings is that a large part of a GAN’s representation can be interpreted. The study shows that a GAN’s internal representation encodes variables that have a causal effect on the generation of objects and realistic images.
Many researchers will potentially benefit from the insights that came out of this work and the proposed framework that will provide a basis for analysis, debugging and understanding of Generative Adversarial Network models.
Artificial Intelligence is one of the most discussed topics today. We have witnessed remarkable progress in the past decade, and we can say that AI is transforming our world. Data scientists and engineers develop AI applications and solutions which can handle increasingly complex problems, many of which are helping to bridge the digital divide and create an inclusive society.
Brain Power, a company founded with the idea of addressing autism through a heads-up wearable computer (like Google Glass), released a range of apps that produce quick insights for children with autism, their parents, and teachers.
Their mission is “to build systems that empower children and adults all along the autism spectrum to teach themselves practical life skills, and assess their progress numerically”.
A new app in the series, called Emotion Charades, monitors and measures children’s anxiety levels using Google Glass and AI. The person with autism sees an emoji floating on either side of someone’s face, then tilts their head to choose the one that matches the facial expression. The software monitors the child’s activity and body language while playing. The data is then uploaded to the cloud, where AI is used to give insights and quick feedback.
Behind Emotion Charades, as well as every other Brain Power product, there is extensive research, rigorous product and clinical testing, and acceptance by families and practitioners. Brain Power’s technology has received positive feedback from people on the spectrum, parents, and professionals, and is validated through published research and clinical trials.
In 2018, the US Centers for Disease Control and Prevention (CDC) determined that approximately 1 in 59 children is diagnosed with an autism spectrum disorder (ASD). For people with autism, technology can mean improved communication abilities and interaction skills. It is therefore of crucial importance to use new technologies to help people with autism achieve their full potential and to build an inclusive society.
Every day, Google processes more than 3.5 billion searches. That is, we must admit, an enormous amount of data coming from Google search queries alone. This search data contains a lot of particularly valuable information about the searchers.
Back in 2008, Google launched a new project that was supposed to take advantage of this data. “Google Flu Trends” was the name of the new project, which aimed to use search query data to forecast flu outbreaks. However, although the ambitions were high and the data was there, Google Flu Trends failed after a few years, in 2013.
Five years later, we see another attempt to use social network data to forecast influenza epidemics. In a pre-print paper, posted on Tuesday on arXiv, researchers from Finland reveal their method for predicting flu outbreaks using Artificial Intelligence and Instagram posts.
They tested their hypothesis and found that Instagram posts have a statistically significant correlation with flu outbreaks. In the paper, they explain their method, which relies on Artificial Intelligence to correlate the number of hashtag references in Instagram posts with the official incidences of flu recorded by Finland’s National Institute for Health and Welfare.
Big (Instagram) Data
They report that they collected data from Instagram posts from 2012 to 2018, amounting to over 22,000 posts. All of the data was public, gathered by searching for hashtags containing words such as “flu” and by analyzing the image content of posts showing boxes and bottles of flu drugs.
They used public health data to predict historical outbreaks of influenza viruses. Their method employs convolutional networks such as Inception and ResNet, together with XGBoost, a gradient-boosted decision tree algorithm. In their article, they show that the method is able to predict flu outbreaks in the final year of data using only data from previous years.
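The evaluation protocol – train on earlier years, predict the final year – can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the data is synthetic, and ordinary least squares stands in for XGBoost and the CNN feature extractors; only the chronological train/test split mirrors the setup described above.

```python
import numpy as np

# Hypothetical weekly data: hashtag counts (features) and flu incidence
# (target) over six years. Real features would be counts and CNN-derived
# image signals; here they are random stand-ins with a known linear link.
rng = np.random.default_rng(0)
weeks = 6 * 52
hashtag_counts = rng.poisson(20, size=(weeks, 3)).astype(float)
incidence = hashtag_counts @ np.array([2.0, 0.5, 1.0]) + rng.normal(0, 1, weeks)

# Chronological split: train on the earlier years, predict the final year.
split = weeks - 52
X_train, y_train = hashtag_counts[:split], incidence[:split]
X_test, y_test = hashtag_counts[split:], incidence[split:]

# Stand-in regressor (ordinary least squares with an intercept column).
coef, *_ = np.linalg.lstsq(np.c_[X_train, np.ones(split)], y_train, rcond=None)
pred = np.c_[X_test, np.ones(52)] @ coef

corr = np.corrcoef(pred, y_test)[0, 1]
print(round(corr, 2))  # high correlation on the held-out final year
```

The point of the chronological split is that no information from the target year leaks into training, which is what makes the correlation on the final year a fair test of forecasting ability.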
This shows that social network data holds significantly valuable information. However, we still have to be careful with our approaches to extracting this information and with relying on it when making decisions. There is also a privacy concern when dealing with public data, especially from social networks.
In 2016, Facebook announced new features towards a more accessible social network: Facebook would use Artificial Intelligence to provide text descriptions of photos for visually impaired people. However, Instagram – the Facebook-owned photo- and video-sharing social network – did not implement these features back then.
Today, Instagram has announced that it will increase accessibility by introducing two new improvements to make it easier for people with visual impairments to use the social network.
Similarly to what Facebook already offers, the new AI-based feature will automatically generate text descriptions for photos. This so-called automatic alternative text will be generated using object recognition technology developed at Facebook.
Together with this, a second feature is being added to Instagram: custom alternative text. This will allow people to add a richer description when uploading a photo, making Instagram more accessible. According to the announcement in Instagram’s blog post, Instagram will automatically generate an alternative description using its AI-based feature only when no custom alternative text has been added.
There are more than 285 million visually impaired people in the world today. With Instagram being one of the most popular and fastest-growing social networks, a great number of people will benefit from these new features. The company announced that these are only the first steps towards a more accessible Instagram, and many more are to be expected in the future.
Computer vision is an interdisciplinary field that has been gaining huge amounts of traction in recent years (since the rise of CNNs), and self-driving cars have taken center stage. One of the most important parts of computer vision is object detection, which helps in solving problems such as pose estimation, vehicle detection, and surveillance.
The difference between object detection algorithms and classification algorithms is that in detection algorithms we try to draw a bounding box around each object of interest to locate it within the image. With object detection, it is possible to draw many bounding boxes, each representing a different object (or another instance of the same object).
The main problem with a standard convolutional network followed by a fully connected layer is that the size of the required output is variable, not constant: the number of occurrences of objects in an image is not fixed. A very simple approach to solving this problem would be to take different regions of interest from the image and use a CNN to classify the presence of the object within each region.
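This naive region-based approach can be sketched as follows; a trivial brightness check stands in for the CNN classifier, which is an assumption purely for illustration.

```python
import numpy as np

def sliding_regions(image, size, stride):
    """Yield (x, y, crop) for square regions of interest across the image.
    A real detector would feed each crop to a CNN classifier."""
    h, w = image.shape
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            yield x, y, image[y:y + size, x:x + size]

def toy_classifier(crop):
    # Stand-in for a CNN: "object present" if the region is mostly bright.
    return crop.mean() > 0.5

# 8x8 image with a bright 4x4 "object" in the top-left corner.
image = np.zeros((8, 8))
image[:4, :4] = 1.0

detections = [(x, y) for x, y, crop in sliding_regions(image, size=4, stride=4)
              if toy_classifier(crop)]
print(detections)  # [(0, 0)]: only the top-left region contains the object
```

The weakness is already visible here: objects come at many positions and scales, so the number of candidate regions explodes, which is exactly the problem region proposal methods like selective search are meant to tame.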
ImageNet is a dataset of over 15 million labeled high-resolution images belonging to roughly 22,000 categories. The images were collected from the web and labeled by human labelers using a crowd-sourcing tool like Amazon’s Mechanical Turk. Starting in 2010, as part of the Pascal Visual Object Challenge, an annual competition called the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) has been held. ILSVRC uses a subset of ImageNet with roughly 1,000 images in each of 1,000 categories.
In all, there are roughly 1.2 million training images, 50,000 validation images, and 150,000 testing images. ImageNet consists of variable-resolution images, so the images have been down-sampled to a fixed resolution of 256×256: given a rectangular image, it is rescaled and the central 256×256 patch is cropped out.
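The down-sampling step can be sketched with plain NumPy. Nearest-neighbour resizing is used here as an assumption to keep the sketch dependency-free; it is not necessarily the resampling method used to build the dataset.

```python
import numpy as np

def center_crop_256(image):
    """Rescale so the shorter side is 256, then crop the central 256x256
    patch, as in the down-sampling described above."""
    h, w = image.shape[:2]
    scale = 256 / min(h, w)
    new_h, new_w = round(h * scale), round(w * scale)
    # Nearest-neighbour resize via integer index maps on each axis.
    rows = (np.arange(new_h) / scale).astype(int).clip(0, h - 1)
    cols = (np.arange(new_w) / scale).astype(int).clip(0, w - 1)
    resized = image[rows][:, cols]
    top = (new_h - 256) // 2
    left = (new_w - 256) // 2
    return resized[top:top + 256, left:left + 256]

# A 480x640 RGB image becomes a 256x256 central patch.
patch = center_crop_256(np.zeros((480, 640, 3)))
print(patch.shape)  # (256, 256, 3)
```

Scaling the shorter side first guarantees both dimensions are at least 256, so the central crop always fits.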
The PASCAL VOC project provides standardized image data sets for object class recognition. It also provides a standard set of tools for accessing the data sets and annotations, enables evaluation and comparison of different methods, and has run challenges evaluating performance on object class recognition.
The goal of R-CNN is to take in an image and correctly identify where the primary objects in the picture are (via bounding boxes).
- Inputs: an image.
- Outputs: bounding boxes and labels for every object in the image.
The R-CNN detection system consists of three modules. The first generates category-independent region proposals, which define the set of candidate detections in an image. The second module is a deep convolutional neural network that extracts a feature vector from each region. The third module is a set of class-specific classifiers, i.e., linear SVMs.
R-CNN does what we might intuitively do as well – propose a bunch of boxes in the image and see if any of them correspond to an object. R-CNN creates these bounding boxes, or region proposals, using a process called Selective Search. At a high level, Selective Search (shown in Fig:1 below) looks at the image through windows of different sizes, and for each size tries to group adjacent pixels by texture, color, or intensity to identify objects.
Once the proposals are created, R-CNN warps each region to a standard square size and passes it through a modified version of AlexNet. On the last layer of the CNN, R-CNN adds Support Vector Machines (SVMs) that classify whether the region contains an object and, if so, which object. This is step 4 in the image above.
Improving the Bounding Boxes
Once the object has been classified, a simple linear regression on the region’s features produces tighter bounding box coordinates:
- Inputs: sub-regions of the image corresponding to objects.
- Outputs: new bounding box coordinates for the object in the sub-region.
So, to summarize, R-CNN is just the following steps:
- Generate a set of region proposals for bounding boxes.
- Run the image regions in the bounding boxes through a pre-trained AlexNet and, finally, an SVM to see what object each box contains.
- Run the box through a linear regression model to output tighter coordinates for the box once the object has been classified.
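The three steps above can be sketched as a single pipeline. All of the plugged-in functions here are toy stand-ins – for selective search, the pre-trained AlexNet, the per-class SVMs, and the per-class box regressors – and none of the names come from the paper’s code; the sketch only shows how the pieces fit together.

```python
import numpy as np

def rcnn_detect(image, propose, cnn_features, svms, bbox_regressors):
    """Sketch of the R-CNN pipeline: proposals -> CNN features ->
    per-class SVM scoring -> per-class bounding box regression."""
    detections = []
    for box in propose(image):                        # 1. region proposals
        feat = cnn_features(image, box)               # 2. warp + CNN features
        scores = {cls: svm(feat) for cls, svm in svms.items()}
        cls = max(scores, key=scores.get)             #    SVM classification
        if scores[cls] > 0:
            refined = bbox_regressors[cls](feat, box) # 3. box regression
            detections.append((cls, refined, scores[cls]))
    return detections

# Toy stand-ins to exercise the control flow end to end.
propose = lambda img: [(0, 0, 4, 4), (4, 4, 8, 8)]   # fixed "proposals"
cnn_features = lambda img, box: img[box[1]:box[3], box[0]:box[2]].mean()
svms = {"cat": lambda f: f - 0.5}                    # positive if bright
bbox_regressors = {"cat": lambda f, box: box}        # identity refinement

image = np.zeros((8, 8))
image[:4, :4] = 1.0                                  # bright "cat" region
detections = rcnn_detect(image, propose, cnn_features, svms, bbox_regressors)
print(detections)  # [('cat', (0, 0, 4, 4), 0.5)]
```

Note that in the real system the CNN runs once per proposal, which is exactly where the runtime cost discussed below comes from.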
The time taken to train the network is huge, as it has to classify 2,000 region proposals per image. The method cannot run in real time, since it takes around 47 seconds per test image. Moreover, selective search is a fixed algorithm, so no learning happens at that stage, which can lead to the generation of bad region proposals.
R-CNN provided state-of-the-art results. Previous systems were complex ensembles combining multiple low-level image features with high-level context from object detectors and scene classifiers. R-CNN is a simple and scalable object detection algorithm that gave a 30% relative improvement over the best previous result on ILSVRC2013.
R-CNN achieved this performance through two insights. The first is to apply high-capacity convolutional neural networks to bottom-up region proposals in order to localize and segment objects. The second is to pre-train a large CNN with supervision and fine-tune it when labeled training data is scarce. R-CNN’s results show that such supervised pre-training is highly useful.
The US electronic commerce and cloud computing giant Amazon has announced that its “Machine Learning University” will now be available for free. The collection of more than 30 online courses will be offered through its subsidiary, Amazon Web Services (AWS).
The goal behind this step is to make Machine Learning available to all developers through AWS. Amazon expects to help developers, data scientists, data platform engineers, and business professionals in building more intelligent applications through machine learning.
“Regardless of where they are in their machine learning journey, one question I hear frequently from customers is: ‘how can we accelerate the growth of machine learning skills in our teams?’ ” – says Dr. Matt Wood, who made this big announcement in the name of Amazon.
“These courses, available as part of a new AWS Training and Certification Machine Learning offering, are now part of my answer,” adds Dr. Wood, expressing enthusiasm that machine learning will become broadly available. Amazon expects machine learning to go from something accessible only to big, well-funded organizations to one of the main skills in every developer’s skillset.
The courses, which will be made available through Amazon’s “Machine Learning University”, comprise 30 self-service, self-paced digital courses totaling more than 45 hours, provided free of charge. Each course starts with the fundamentals and builds on them through real-world examples and labs, most of which are based on real problems encountered at Amazon.
The service offers specialized and tailored learning paths depending on your profession. Moreover, it offers an AWS certification that can help developers get recognized in the industry.
Amazon’s free machine learning courses are available here.