Like GitHub Copilot without Microsoft telemetry • The Register

Voice Of EU

Updated GitHub Copilot, one of several recent tools that use AI models to generate code suggestions for programmers, remains problematic for some users due to licensing concerns and the telemetry the software sends back to the Microsoft-owned company.

So Brendan Dolan-Gavitt, assistant professor in the computer science and engineering department at NYU Tandon in the US, has released FauxPilot, an alternative to Copilot that runs locally without phoning home to the Microsoft mothership.

Copilot relies on OpenAI Codex, a natural language-to-code system based on GPT-3 that was trained on “billions of lines of public code” in GitHub repositories. That has made advocates of free and open source software (FOSS) uncomfortable because Microsoft and GitHub have failed to specify exactly which repositories informed Codex.

As Bradley Kuhn, policy fellow at the Software Freedom Conservancy (SFC), wrote in a blog post earlier this year, “Copilot leaves copyleft compliance as an exercise for the user. Users likely face growing liability that only increases as Copilot improves. Users currently have no methods besides serendipity and educated guesses to know whether Copilot’s output is copyrighted by someone else.”

Shortly after GitHub Copilot became commercially available, the SFC urged open-source maintainers not to use GitHub in part due to its refusal to address concerns about Copilot.

Not a perfect world

FauxPilot doesn’t use Codex. It relies on Salesforce’s CodeGen model. However, that’s unlikely to appease FOSS advocates because CodeGen was also trained using public open-source code without regard to the nuances of different licenses.

“The models that it’s using right now are ones that were trained by Salesforce, and they were again, trained basically on all of GitHub public code,” explained Dolan-Gavitt in a phone interview with The Register. “So there are some issues still there, potentially with licensing, that wouldn’t be resolved by this.

“On the other hand, if someone with enough compute power came along and said, ‘I’m going to train a model that’s only trained on GPL code or has a license that lets me reuse it without attribution’ or something like that, then they could train their model, drop that model into FauxPilot, and use that model instead.”

For Dolan-Gavitt, the primary goal of FauxPilot is to provide a way to run the AI assistance software on-premises.

“There are people who have privacy concerns, or maybe, in the case of work, some corporate policies that prevent them from sending their code to a third-party, and that definitely is helped by being able to run it locally,” he explained.
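Running locally works because FauxPilot serves completions over an OpenAI-compatible REST API, which is what lets editor plugins treat it as a drop-in Copilot backend. A minimal client sketch, assuming a default install listening on port 5000 with the `codegen` engine (the endpoint path, port, and engine name reflect the project's defaults at the time of writing, so check your own setup):

```python
import json
import urllib.request

# Assumed default address of a local FauxPilot docker-compose deployment.
FAUXPILOT_URL = "http://localhost:5000/v1/engines/codegen/completions"

def build_request(prompt, max_tokens=32, temperature=0.1):
    """Build an OpenAI-style completion request payload."""
    return {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def complete(prompt):
    """Send the prompt to the local server; no code leaves the machine."""
    payload = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        FAUXPILOT_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["text"]

# complete("def fibonacci(n):") would return a suggested continuation,
# but only if a FauxPilot server is actually running locally.
```

Because the wire format mirrors OpenAI's, swapping in a differently trained model, as Dolan-Gavitt suggests, requires no client-side changes.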

GitHub, in its description of what data Copilot collects, describes an option to disable the collection of Code Snippets Data, which includes “source code that you are editing, related files and other files open in the same IDE or editor, URLs of repositories and files paths.”

But doing so does not appear to disable the gathering of User Engagement Data – “user edit actions like completions accepted and dismissed, and error and general usage data to identify metrics like latency and features engagement” and potentially “personal data, such as pseudonymous identifiers.”

Dolan-Gavitt said he sees FauxPilot as a research platform.

“One thing that we want to do is train code models that hopefully output more secure code,” he explained. “And once we do that, we’ll want to be able to test them and maybe even test them with actual users using something like Copilot but with our own models. So that was kind of motivation.”

Doing so, however, has some challenges. “At the moment, it’s somewhat impractical to try and create a dataset that doesn’t have any security vulnerabilities because the models are really data hungry,” said Dolan-Gavitt.

“So they want lots and lots of code to train on. But we don’t have very good or foolproof ways of ensuring that code is bug free. So it would be an immense amount of work to try and curate a data set that was free of security vulnerabilities.”

Nonetheless, Dolan-Gavitt, who co-authored a paper on the insecurity of Copilot code suggestions, finds AI assistance useful enough to stick with it.

“My personal feeling on this is I’ve had Copilot turned on basically since it came out last summer,” he explained. “I do find it really useful. That said, I do kind of have to double check its work. But it’s often easier for me at least to start with something that it gives me and then edit it into correctness than to try to create it from scratch.” ®

Updated to add

Dolan-Gavitt has warned us that if you use FauxPilot with the official Visual Studio Code Copilot extension, the latter will still send telemetry, though not code completion requests, to GitHub and Microsoft.

“Once we have our own VSCode extension working … that issue will be solved,” he said. This custom extension needs to be updated now that the InlineCompletion API has been finalized by the Windows giant.

So essentially, the base FauxPilot doesn’t phone home to Redmond, though if you’re using it with Visual Studio Code and want a completely non-Microsoft experience, you’ll have to grab the project’s own extension when it’s ready.

UK govt wants criminal migrants to scan their faces each day • The Register

In brief The UK’s Home Office and Ministry of Justice want migrants with criminal convictions to scan their faces up to five times a day using a smartwatch kitted out with facial-recognition software.

Plans for wrist-worn face-scanning devices were discussed in a data protection impact assessment report from the Home Office. Officials called for “daily monitoring of individuals subject to immigration control,” according to The Guardian this week, and suggested any such entrants to the UK should wear fitted ankle tags or smartwatches at all times.

In May, the British government awarded a contract worth £6 million to Buddi Limited, makers of a wristband used to monitor older folks at risk of falling. Buddi appears to be tasked with developing a device capable of taking images of migrants to be sent to law enforcement to scan.

Location data will also be beamed back, and up to five images will be sent every day, allowing officials to track known criminals’ whereabouts. Only foreign-national offenders convicted of a criminal offense will be targeted, it is claimed, and the data will be shared with the Ministry of Justice and the Home Office.

“The Home Office is still not clear how long individuals will remain on monitoring,” commented Monish Bhatia, a lecturer in criminology at Birkbeck, University of London.

“They have not provided any evidence to show why electronic monitoring is necessary or demonstrated that tags make individuals comply with immigration rules better. What we need is humane, non-degrading, community-based solutions.”

Amazon’s Alexa model goes multilingual

Amazon’s machine-learning scientists have shared some info on their work developing multilingual language models that can take themes and context learned in one language and apply that knowledge in another language without any extra training.

For this technology demonstration, they built a 20-billion-parameter transformer-based system, dubbed the Alexa Teacher Model or AlexaTM, and fed it terabytes of text scraped from the internet in Arabic, English, French, German, Hindi, Italian, Japanese, Marathi, Portuguese, Spanish, Tamil, and Telugu.

It’s hoped this research will help them add capabilities to models like the ones powering Amazon’s smart assistant Alexa, and have this functionality automatically supported in multiple languages, saving them time and energy.

Talk to Meta’s AI chatbot

Meta has rolled out its latest version of its machine-learning-powered language model virtual assistant, Blenderbot 3, and put it on the internet for anyone to chat with.

Traditionally this kind of thing hasn’t ended well, as Microsoft’s Tay bot showed in 2016, when web trolls found the right phrases to make the software pick up and repeat new words, such as Nazi sentiments.

People just like to screw around with bots to make them do stuff that will generate controversy – or perhaps even just use the software as intended and it goes off the rails all by itself. Meta’s prepared for this and is using the experiment to try out ways to block offensive material.

“Developing continual learning techniques also poses extra challenges, as not all people who use chatbots are well-intentioned, and some may employ toxic or otherwise harmful language that we do not want BlenderBot 3 to mimic,” it said. “Our new research attempts to address these issues.”

Meta will collect information about your browser and your device through cookies if you try out the model; you can decide whether you want the conversations logged by the Facebook parent. Be warned, however, Meta may publish what you type into the software in a public dataset. 

“We collect technical information about your browser or device, including through the use of cookies, but we use that information only to provide the tool and for analytics purposes to see how individuals interact on our website,” it said in a FAQ. 

“If we publicly release a data set of contributed conversations, the publicly released dataset will not associate contributed conversations with the contributor’s name, login credentials, browser or device data, or any other personally identifiable information. Please be sure you are okay with how we’ll use the conversation as specified below before you consent to contributing to research.”

Reversing facial recognition bans

More US cities have passed bills allowing police to use facial-recognition software after previous ordinances were passed limiting the technology.

CNN reported that local authorities in New Orleans, Louisiana, and in the state of Virginia are among those that have changed their minds about banning facial recognition. The software is risky in the hands of law enforcement, where the consequences of a mistaken identification are harmful; the technology can misidentify people of color, for instance.

Those concerns, however, don’t seem to have put officials off from using such systems. Some have even voted to approve its use by local police departments when they previously were against it.

Adam Schwartz, a senior staff attorney at the Electronic Frontier Foundation, told CNN “the pendulum has swung a bit more in the law-and-order direction.”

Scott Surovell, a state senator in Virginia, said law enforcement should be transparent about how they use facial recognition, and that there should be limits in place to mitigate harm. Police may run the software to find new leads in cases, for example, he said, but should not be able to use the data to arrest someone without conducting investigations first. 

“I think it’s important for the public to have faith in how law enforcement is doing their job, that these technologies be regulated and there be a level of transparency about their use so people can assess for themselves whether it’s accurate and or being abused,” he said. ®

An open-source data platform helping satellites get to orbit

VictoriaMetrics’ data monitoring platform will be used by Open Cosmos as it looks to launch low-Earth orbit satellites.

A Ukrainian start-up that provides monitoring services for companies has taken on a new task – helping to get satellites into orbit.

VictoriaMetrics has developed an open-source time series database and monitoring platform.

Founded in 2018 by former engineers of Google, Cloudflare and Lyft, the company said it has seen “unprecedented growth” in the last year. It surpassed 50m downloads in April and has gained customers including Grammarly, Wix, Adidas and Brandwatch.

Now, VictoriaMetrics is teaming up with UK-based space-tech company Open Cosmos to power the launch of its low-Earth orbit satellites.

Helping launch satellites

VictoriaMetrics said its services address the needs of organisations with increasingly complex data volumes and the demand for better observability platforms. Designed to be scalable for a wide variety of sectors, it offers a free version of its service and a paid enterprise option for those who want custom features and priority support.

Open Cosmos specialises in satellite manufacturing, testing, launch and in-orbit exploitation. It needed an application that could provide insights into the data powering its satellites.

The space-tech business has now integrated the VictoriaMetrics platform into its mission-critical satellite control and data distribution platform. Open Cosmos is also using a VictoriaMetrics feature that lets it take metrics from satellites and ground equipment across different labs and test facilities, before uploading them to mission control software.
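VictoriaMetrics speaks the Prometheus ecosystem’s protocols, so telemetry flows like the ones Open Cosmos describes can be sketched against its standard HTTP endpoints. The metric and label names below are hypothetical, and the sketch assumes a single-node instance on its default port 8428:

```python
from urllib.parse import urlencode

# Assumed default address of a local single-node VictoriaMetrics instance.
BASE = "http://localhost:8428"

def exposition_line(name, labels, value):
    """Format one sample in Prometheus exposition format, which
    VictoriaMetrics accepts for ingestion (e.g. on
    /api/v1/import/prometheus)."""
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{name}{{{label_str}}} {value}"

def query_url(promql):
    """Build an instant-query URL against the Prometheus-compatible
    /api/v1/query endpoint (VictoriaMetrics' MetricsQL is a superset
    of PromQL)."""
    return f"{BASE}/api/v1/query?{urlencode({'query': promql})}"

# A hypothetical satellite-health sample, as it might be pushed from a
# ground station before upload to mission control software...
line = exposition_line("satellite_battery_voltage",
                       {"sat_id": "oc-1", "subsystem": "eps"}, 7.42)

# ...and a query an alerting system might run over the last five minutes.
url = query_url('avg_over_time(satellite_battery_voltage{sat_id="oc-1"}[5m])')
```

Keeping both ingestion and querying on plain HTTP with text formats is part of what makes such a system easy to fork for domain-specific uses, as Rodeja notes below.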

“The health of our customers’ space assets is highly important, and VictoriaMetrics’ monitoring is crucial for ensuring our satellites remain healthy, playing an indispensable role in powering our satellite alert system,” said Open Cosmos ground segment technical lead Pep Rodeja.

“The fact that VictoriaMetrics is completely open source has been a massive benefit too, allowing us to fork the technology to space-specific problems far beyond our initial expectations.”

Data is the new oil

Speaking about the company’s growth, VictoriaMetrics co-founder Roman Khavronenko told SiliconRepublic.com that the start-up was “in the right time, in the right place”.

He said that “observability” became more of a focus for companies in recent years, and good systems were needed to collect and process data.

“Data is like a new oil,” Khavronenko added. “The more data you have, the more insight you have and the more predictions you can build on that.

“VictoriaMetrics was designed to address these high-scalability requirements for monitoring systems and remain simple and reliable at the same time.”

While its founders are based in Ukraine, VictoriaMetrics is headquartered in San Francisco and has an expanding team distributed across Europe and the US. Khavronenko said the company’s main aim in the future is developing its team, as success does not come from the product but “the team behind the product”.

“In the next three, five years, I hope that we will expand and build more independent teams inside VictoriaMetrics, which will be able to produce even better products to expand even further and bring better ideas and simplify observability in the world.”

Another court case fails to unlock the mystery of bitcoin’s Satoshi Nakamoto

Who is Satoshi Nakamoto? The mysterious inventor of bitcoin is a renowned figure in the world of cryptocurrency but his true identity is unknown.

However, the British blogger Peter McCormack was certain about one thing: the answer isn’t Craig Wright.

For years Wright, an Australian computer scientist, has claimed that he is Satoshi, the pseudonymous author of the 2008 white paper behind bitcoin.

Wright’s assertion that he is the inventor of the digital asset – he first sought to prove that he is Satoshi in 2016, months after his name first emerged – has led to a series of legal tussles, some of which are continuing.

One of them came to a pyrrhic conclusion in London this week, when McCormack was found to have caused serious harm to Wright’s reputation by repeatedly claiming that he is a fraud and is not Satoshi.

But Wright, 52, won nominal damages of £1 after a high court judge ruled that he had given “deliberately false evidence” to support his libel claim.

For cost reasons, McCormack did not offer a defence of truth – where the defendant in the case attempts to show that the allegations are substantially true. Mr Justice Chamberlain ruled that one claim made in a video discussion on YouTube was defamatory, and that a series of tweets repeating the fraud claims had caused serious harm to Wright’s reputation.

“Because he [Wright] advanced a deliberately false case and put forward deliberately false evidence until days before trial, he will recover only nominal damages,” wrote the judge.

McCormack’s defence, shifted to a much narrower footing, was that the video and the tweets did not cause serious harm to Wright’s reputation. Wright claimed that his reputation had been seriously harmed by the tweets because he had been disinvited from 10 conferences, which meant that academic papers due to be presented at those events had not been published.

McCormack submitted evidence from conference organisers who challenged Wright’s claims. Those claims were then dropped from Wright’s case at the trial in May.

The judge was scathing. He said: “Dr Wright’s original case on serious harm, and the evidence supporting it, both of which were maintained until days before trial, were deliberately false.”

Wright, who lives in Surrey and is the chief scientist at the blockchain technology firm nChain, said he had brought the case “not for financial reward, but for the principle and to get others to think twice before seeking to impugn my reputation”.

And the legal cases continue to pile up. Wright has other high court cases pending. He has brought a libel case against a Norwegian Twitter user, Marcus Granath, who has also accused the Australian of being a fraud. Granath recently failed in an attempt to have the case thrown out.

Wright is also suing two cryptocurrency exchanges in a case that argues that a digital asset called Bitcoin Satoshi Vision (BSV), which he backs, is the true descendant of the white paper.

The Crypto Open Patent Alliance (Copa), a non-profit that supports cryptocurrencies, is seeking a high court declaration that Wright is not the author of the white paper. Its case claims that Wright forged evidence produced to support his assertion that he is Satoshi. Wright, who denies Copa’s claims, failed in an attempt to have the case struck out last year.

There was more legal back and forth before that. In 2020, Wright lost an attempt to sue Roger Ver, an early bitcoin backer, for calling Wright a fraud on YouTube after a judge ruled that the appropriate jurisdiction for a lawsuit would be the US. One year later, Wright won a copyright infringement claim against the anonymous operator and publisher of the bitcoin.org website for publishing the white paper. Wright won by default after bitcoin.org’s publisher, who goes by the pseudonym of Cobra, declined to speak in their defence.

In the US, Wright won a case in December that spared him having to pay out a multibillion-dollar sum in bitcoins to the family of David Kleiman, a former business partner. Kleiman’s family had claimed that he was a co-creator of bitcoin along with Wright and they were therefore owed half of the 1.1m bitcoins “mined” by Satoshi.

The case was closely watched in the expectation that if Wright lost he would have had to move those bitcoins – seen as the sword-in-the-stone test that would prove Satoshi’s true identity. Those coins are now worth $25bn (£21bn) at the current price of about $23,000 and sit on the bitcoin blockchain, a decentralised ledger that records all bitcoin transactions.

Satoshi published the cryptocurrency’s foundation text – Bitcoin: A Peer-to-Peer Electronic Cash System – on 31 October 2008 and communicated by email with the currency’s first adherents before disappearing in 2011.

Carol Alexander, professor of finance at University of Sussex business school, says Wright could prove that he is Satoshi by using the so-called private keys – a secure code comprising a hexadecimal string of numbers and letters – that will unlock access to the bitcoins.

“The only way that Wright could prove he is SN would be to make a transaction with some of the original bitcoin,” she said.

Wright is adamant that he will not do this, saying private keys do not prove ownership or identity. There are few other Satoshi candidates. In 2014, a Japanese-American man, Dorian S Nakamoto, was named by Newsweek as the creator of bitcoin and promptly denied any link to the digital currency. More informed speculation has centred on Nick Szabo, an American computer scientist who designed BitGold, viewed as a conceptual precursor to bitcoin. But he too has denied claims that he might be Satoshi.

In the meantime, Mr Justice Chamberlain left open a question that remains unanswered. “The identity of Satoshi is not among the issues I have to determine,” he said.
