

How feature stores can reduce the ‘Groundhog Day’ effect for data scientists



Splice Machine’s Monte Zweben explains how feature stores can help cut down the monotonous parts of a data scientist’s job.

Many people pursue a career in data science because they love solving problems. But something called the Groundhog Day effect can limit that, according to Monte Zweben, CEO of real-time AI company Splice Machine.

Zweben, who previously worked as the deputy chief of AI at NASA’s Ames Research Center and sits on the advisory board for Carnegie Mellon University’s School of Computer Science, believes feature stores can help.


‘Spending all your time on monotonous work can lead to unhappiness with the job’

Can you explain what the Groundhog Day effect is for data scientists?

Work as a data scientist follows a cycle: log in, clean data, define features, test and build a model. Except not all parts of the cycle are created equal; data preparation takes 80pc of any given data scientist’s time.

No matter what project you’re working on, most days you’re cleaning data and converting raw data into features that machine learning models can understand. The monotonous void of data prep blends hours together and makes each day identical to the one before it.

With one person, it’s annoying to have to repeat the same work all the time; with a team, each person building features slightly differently can lead to inconsistent results.

Does this effect pose an issue?

From a productivity perspective, it’s incredibly inefficient for one person to repeat their own work multiple times; that’s time and money spent on unnecessary tasks, which makes models slower to get up and running.

From an employee perspective, spending all your time on monotonous work can lead to unhappiness with the job and increase employee turnover. For the business as a whole, lacking a centralised data process can also lead to inconsistencies in business.

If different people are defining features differently across a company, this can cause models and business decisions to differ based on feature definitions. Lifetime value of a customer (LVC) is a great example. One team might define the lifetime value as a customer’s total past spending, while another might include the customer’s projected value in the LVC.

Inconsistent definitions can lead to preferential treatment in a company and affect customer retention in the long term.
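The lifetime-value example above can be made concrete with a small sketch. The `Customer` class, its field names and the figures are hypothetical and chosen only for illustration; the point is that two reasonable definitions return different numbers for the same customer.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Customer:
    past_orders: List[float]   # amounts the customer has already spent
    projected_spend: float     # hypothetical forecast of future spending


def ltv_team_a(customer: Customer) -> float:
    """Team A: lifetime value is the customer's total past spending."""
    return sum(customer.past_orders)


def ltv_team_b(customer: Customer) -> float:
    """Team B: lifetime value also includes the projected future value."""
    return sum(customer.past_orders) + customer.projected_spend


customer = Customer(past_orders=[120.0, 80.0], projected_spend=300.0)
print(ltv_team_a(customer))  # 200.0
print(ltv_team_b(customer))  # 500.0
```

A model trained on one definition and served with the other would see a feature whose meaning has silently changed, which is the kind of inconsistency a shared feature definition is meant to prevent.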

What are feature stores? How can they benefit data workers?

A feature store is a shareable repository of features made to automate the input, tracking and governance of data into machine learning models. Feature stores compute and store features, enabling them to be registered, discovered, used and shared across a company.

A feature store makes sure features are always up to date for predictions and maintains the history of each feature’s values in a consistent manner, so that models can be easily trained and re-trained.
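One way to picture keeping features current while preserving their history is a time-stamped log per feature with an "as of" lookup. The sketch below is an illustrative in-memory model, not any particular product's API; the class name `FeatureHistory` and its methods are assumptions, and it assumes values are written in timestamp order.

```python
from bisect import bisect_right
from collections import defaultdict


class FeatureHistory:
    """Toy time-stamped log of feature values (hypothetical, in-memory)."""

    def __init__(self):
        # (entity_id, feature) -> parallel lists of timestamps and values
        self._times = defaultdict(list)
        self._values = defaultdict(list)

    def write(self, entity_id, feature, timestamp, value):
        # Assumes writes for a given key arrive in increasing timestamp order.
        key = (entity_id, feature)
        self._times[key].append(timestamp)
        self._values[key].append(value)

    def latest(self, entity_id, feature):
        """Most recent value, as used when serving a prediction."""
        values = self._values.get((entity_id, feature))
        return values[-1] if values else None

    def as_of(self, entity_id, feature, timestamp):
        """Value that was current at `timestamp`, as used for training."""
        key = (entity_id, feature)
        idx = bisect_right(self._times.get(key, []), timestamp)
        return self._values[key][idx - 1] if idx > 0 else None


history = FeatureHistory()
history.write("c1", "total_spend", timestamp=1, value=100.0)
history.write("c1", "total_spend", timestamp=5, value=250.0)

print(history.latest("c1", "total_spend"))    # 250.0
print(history.as_of("c1", "total_spend", 3))  # 100.0
```

Looking up the value as of a past timestamp lets a training set be rebuilt with the values the model would have seen at that time, rather than with today's values.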

Feature stores enable total model transparency, guarantee consistent training and can serve models real-time updates of aggregate data sets.

How do feature stores work?

A feature store is a repository of features, feature sets, and feature values, along with their feature history. The feature store has a set of services that interact with this repository. These services include defining features, searching for features, retrieving the current value of features, associating meta-data with those features, defining a training set from groups of features, and backfilling new features into training sets.

In some implementations, feature stores have user interfaces that call those services, and in others they are just APIs.
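The services described above can be sketched as a small in-memory class. This is a simplified illustration under stated assumptions, not the interface of Splice Machine or any other feature store: `MiniFeatureStore`, its method names and the example features are all hypothetical, and backfilling is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class FeatureDefinition:
    name: str
    description: str
    tags: List[str] = field(default_factory=list)  # searchable meta-data


class MiniFeatureStore:
    """Hypothetical API-only feature store holding current values."""

    def __init__(self) -> None:
        self._definitions: Dict[str, FeatureDefinition] = {}
        self._current: Dict[Tuple[str, str], Any] = {}

    def define_feature(self, name: str, description: str,
                       tags: Optional[List[str]] = None) -> None:
        self._definitions[name] = FeatureDefinition(name, description, list(tags or []))

    def search(self, term: str) -> List[str]:
        """Return names of features whose name, description or tags match."""
        term = term.lower()
        return sorted(
            d.name for d in self._definitions.values()
            if term in d.name.lower()
            or term in d.description.lower()
            or any(term in t.lower() for t in d.tags)
        )

    def set_value(self, entity_id: str, feature: str, value: Any) -> None:
        if feature not in self._definitions:
            raise KeyError(f"feature {feature!r} has not been defined")
        self._current[(entity_id, feature)] = value

    def get_value(self, entity_id: str, feature: str) -> Any:
        return self._current.get((entity_id, feature))

    def training_set(self, entity_ids: List[str], features: List[str]) -> List[dict]:
        """Assemble one row per entity from a chosen group of features."""
        return [
            {"entity_id": e, **{f: self.get_value(e, f) for f in features}}
            for e in entity_ids
        ]


store = MiniFeatureStore()
store.define_feature("total_spend", "Sum of past order amounts", tags=["customer", "revenue"])
store.define_feature("days_since_last_order", "Days since most recent order", tags=["customer", "recency"])
store.set_value("c1", "total_spend", 200.0)
store.set_value("c1", "days_since_last_order", 12)

print(store.search("revenue"))  # ['total_spend']
print(store.training_set(["c1"], ["total_spend", "days_since_last_order"]))
```

A user interface, where one exists, would call the same methods; the registry of definitions is what makes a feature discoverable by someone who did not build it.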

Feature stores are fed by pipelines that transform raw data into features. These features can then be defined, declared into groups, and assigned meta-data that makes them easier to search for. Once the features are in the store, they are used to create training views and training sets and to serve features. These mechanisms allow feature stores to automate data transformation, serve aggregate features in real time and monitor models in real time.
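As a rough illustration of a pipeline that turns raw data into features, the sketch below aggregates raw order records into one group of per-customer features. The record layout, the function name `build_customer_features` and the sample data are assumptions made for the example; a real pipeline would typically run in SQL or a data-processing framework and write its output to the store.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw data: one record per order.
raw_orders = [
    {"customer_id": "c1", "amount": 120.0, "order_date": date(2021, 1, 5)},
    {"customer_id": "c1", "amount": 80.0, "order_date": date(2021, 3, 10)},
    {"customer_id": "c2", "amount": 40.0, "order_date": date(2021, 2, 1)},
]


def build_customer_features(orders, as_of):
    """Aggregate raw orders into a feature group keyed by customer."""
    grouped = defaultdict(list)
    for order in orders:
        grouped[order["customer_id"]].append(order)

    features = {}
    for customer_id, rows in grouped.items():
        last_order = max(r["order_date"] for r in rows)
        features[customer_id] = {
            "order_count": len(rows),
            "total_spend": sum(r["amount"] for r in rows),
            "days_since_last_order": (as_of - last_order).days,
        }
    return features


features = build_customer_features(raw_orders, as_of=date(2021, 4, 1))
print(features["c1"])
# {'order_count': 2, 'total_spend': 200.0, 'days_since_last_order': 22}
```

Because the transformation lives in one place, every model that asks for `total_spend` gets the same definition instead of a per-project reimplementation.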

How would you recommend data workers get on board with feature stores? What tips would you give them?

My number-one recommendation is to prepare for the future. Even if you only have a few models in production right now, I’ve seen so many data workers struggle to scale an ad-hoc data architecture. Within 10 years, the most successful companies will have hundreds and thousands of machine learning models running simultaneously; this will be impossible to manage without a feature store.

If you’re on the fence, just try one out! They’re easy to use and will seriously change your data workflow in the best way possible. I personally recommend trying a SQL-powered feature store. Make sure you get a feature store that works on any cloud and on premise, too. You don’t want to get locked into a specific cloud that might be less competitive in the future.

Are there any resources on the topic you would recommend?

I’m actually writing a book on feature stores for machine learning with Manning Publishers (scheduled release in 2021). is a great central location for lots of information about feature stores. The Towards Data Science blog on Medium has some great content on feature stores, too.



SSD belonging to Euro-cloud Scaleway was stolen from back of a truck, then turned up on YouTube • The Register



In brief DeepMind and the European Bioinformatics Institute released a database of more than 350,000 3D protein structures predicted by the biz’s AI model AlphaFold.

That data covers the 20,000 or so proteins made in the human body, and is available for anyone to study. The proteomes of 20 other organisms, from zebrafish to E. coli bacteria, are in there, too, and hundreds of millions more structures will be added over time, we’re told.

“In the hands of scientists around the world, this new protein almanac will enable and accelerate research that will advance our understanding of these building blocks of life,” said DeepMind’s CEO Demis Hassabis. He hopes that it will be a valuable resource that will be used in the discovery of new drugs and our understanding of diseases.



Reid Hoffman to join board of electric air-taxi start-up Joby




LinkedIn co-founder Reid Hoffman is helping to take Joby, which is being billed as ‘Tesla meets Uber in the air’, public through a SPAC deal.

Electric air-taxi start-up Joby Aviation will add Silicon Valley figure Reid Hoffman to its board as the company prepares to go public via a merger with a blank-cheque firm.

LinkedIn co-founder Hoffman, who is now a partner at venture capital firm Greylock, has a key connection to the 12-year-old start-up. Earlier this year, it was announced that Joby is going public through a $6.6bn reverse merger deal with Reinvent Technology Partners, the special purpose acquisition company (SPAC) Hoffman set up with Zynga founder Mark Pincus and investor Michael Thompson.

The deal is expected to close this summer. Joby is the first aerial vehicle start-up to go public via the SPAC route, and the deal will provide the company with $1.6bn in cash.

SPACs have been growing in popularity this year as they can provide a quicker way of taking a company public than the traditional route of an initial public offering.


Hoffman will be added to the Joby board once the deal is complete, alongside Google general counsel Halimah DeLaine Prado and former Southwest Airlines CFO Laura Wright.

Toyota Motor Corporation board member and operating officer James Kuffner and Zoox CEO Aicha Evans have already been added to the board in recent months.

“We are incredibly humbled to have been able to assemble such a remarkable and diverse group of world-class leaders to guide and support Joby as we plan to enter the public market,” said JoeBen Bevirt, Joby CEO and founder.

Joby acquired Uber’s Elevate flying car business at the end of December and now plans to begin a commercial passenger ‘air taxi’ service in 2024. Hoffman described the venture as “Tesla meets Uber in the air” in a recent interview.

The company will work with Toyota from its California-based manufacturing facility to build its electric vertical takeoff and landing (eVTOL) aircraft. Toyota led the company’s $620m Series C funding round last year, with other investors including Intel Capital and JetBlue Technology Ventures.



Virtual contact worse than no contact for over-60s in lockdown, says study | Coronavirus



Virtual contact during the pandemic made many over-60s feel lonelier and more depressed than no contact at all, new research has found.

Many older people stayed in touch with family and friends during lockdown using the phone, video calls, and other forms of virtual contact. Zoom choirs, online book clubs and virtual bedtime stories with grandchildren helped many stave off isolation.

But the study, among the first to comparatively assess social interactions across households and mental wellbeing during the pandemic, found many older people experienced a greater increase in loneliness and long-term mental health disorders as a result of the switch to online socialising than those who spent the pandemic on their own.

“We were surprised by the finding that an older person who had only virtual contact during lockdown experienced greater loneliness and negative mental health impacts than an older person who had no contact with other people at all,” said Dr Yang Hu of Lancaster University, who co-wrote the report, published on Monday in Frontiers in Sociology.

“We were expecting that a virtual contact was better than total isolation but that doesn’t seem to have been the case for older people,” he added.

The problem, said Hu, was that older people unfamiliar with technology found it stressful to learn how to use it. But even those who were familiar with technology often found the extensive use of the medium over lockdown so stressful that it was more damaging to their mental health than simply coping with isolation and loneliness.

“Extensive exposure to digital means of communication can also cause burnout. The results are very consistent,” said Hu, who collected data from 5,148 people aged 60 or over in the UK and 1,391 in the US – both before and during the pandemic.

“It’s not only loneliness that was made worse by virtual contact, but general mental health: these people were more depressed, more isolated and felt more unhappy as a direct result of their use of virtual contact,” he said.

The report, Covid-19, Inter-household Contact and Mental Wellbeing Among Older Adults in the US and the UK, analysed national data from the UK’s Economic and Social Research Council-funded Understanding Society Covid-19 survey and the US Health and Retirement Study.

Hu said more emphasis needed to be placed on safe ways to have face-to-face contact in future emergencies. There must also, he added, be a drive to bolster the digital capacity of the older age groups.

“We need to have disaster preparedness,” he said. “We need to equip older people with the digital capacity to be able to use technology for the next time a disaster like this comes around.”

The findings outlined the limitations of a digital-only future and the promise of a digitally enhanced future in response to population ageing in the longer term, added Hu.

“Policymakers and practitioners need to take measures to pre-empt and mitigate the potential unintended implications of household-centred pandemic responses for mental wellbeing,” he said.

Caroline Abrahams, charity director at Age UK, welcomed the report. “We know the virtual environment can exacerbate those feelings of not actually being there with loved ones in person,” she said.

“It’s essential therefore that government makes preventing and tackling loneliness a top policy priority, backed up with adequate funding.

“It’s not over the top to point out that in the worst cases, loneliness can kill in the sense that it undermines resilience to health threats of many kinds, as well as leading to older people in the twilight of their lives losing all hope, so they lack a reason to carry on.”

Patrick Vernon, associate director at the Centre for Ageing Better, said he saw many examples of older people using technology to stay connected in “really positive ways”.

But he was also doubtful: “We know that even for those who are online, lack of skills and confidence can prevent people from using the internet in the ways that they’d like to.”

Previous research by the Centre for Ageing Better found that since the pandemic, there had been significant increases in the use of digital technology among those aged 50-70 years who were already online.

But there are still 3 million people across the UK who are offline, with a significant digital divide affecting low-income households. Twenty-seven per cent of people aged 50-70 with an annual household income under £25,000 were offline before the pandemic.

Vernon said: “Our research has found that some people who were offline found it difficult to connect with family, friends and neighbours during the pandemic – and even those who were online said technology didn’t compensate for missing out on physical social interactions.”


