TY - GEN
T1 - How to incentivize data-driven collaboration among competing parties
AU - Azar, Pablo Daniel
AU - Goldwasser, Shafi
AU - Park, Sunoo
PY - 2016/1/14
Y1 - 2016/1/14
N2 - The availability of vast amounts of data is changing how we can make medical discoveries, predict global market trends, save energy, and develop new educational strategies. In certain settings such as Genome Wide Association Studies or deep learning, the sheer size of data (patient files or labeled examples) seems critical to making discoveries. When data is held distributedly by many parties, as often is the case, they must share it to reap its full benefits. One obstacle is the reluctance of different entities to share their data, due to privacy concerns or loss of competitive edge. Work on cryptographic multi-party computation over the last 30 years address the privacy aspects, but sheds no light on individual parties' losses and gains when access to data carries tangible rewards. Even if it is clear that better overall conclusions can be drawn from collaboration, are individual collaborators better off by collaborating? Addressing this question is the topic of this paper. The order in which collaborators receive the outputs of a collaboration will be a crucial aspect of our modeling and solutions. We believe that timing is an important and unaddressed issue in data-based collaborations. Our contributions are as follows. We formalize a model of n-party collaboration for computing functions over private inputs in which the participants receive their outputs in sequence, and the order depends on their private inputs. Each output \improves" on all previous outputs according to a reward function. We say that a mechanism for collabo-ration achieves a collaborative equilibrium if it guarantees a higher reward for all participants when joining a collaboration compared to not joining it. We show that while in general computing a collaborative equilibrium is NP-complete, we can design polynomial-Time algorithms for computing it for a range of natural model settings. When possible, we design mechanisms to compute a distribution of outputs and an ordering of output delivery, based on the n participants' private inputs, which achieves a collaborative equilibrium. The collaboration mechanisms we develop are in the standard model, and thus require a central trusted party; however, we show that this assumption is not necessary under standard cryptographic assumptions. We show how the mechanisms can be implemented in a decentralized way by n distrustful parties using new extensions of classical secure multiparty computation that impose order and timing constraints on the delivery of outputs to different players, in addition to guaranteeing privacy and correctness.
AB - The availability of vast amounts of data is changing how we can make medical discoveries, predict global market trends, save energy, and develop new educational strategies. In certain settings such as Genome Wide Association Studies or deep learning, the sheer size of data (patient files or labeled examples) seems critical to making discoveries. When data is held distributedly by many parties, as often is the case, they must share it to reap its full benefits. One obstacle is the reluctance of different entities to share their data, due to privacy concerns or loss of competitive edge. Work on cryptographic multi-party computation over the last 30 years address the privacy aspects, but sheds no light on individual parties' losses and gains when access to data carries tangible rewards. Even if it is clear that better overall conclusions can be drawn from collaboration, are individual collaborators better off by collaborating? Addressing this question is the topic of this paper. The order in which collaborators receive the outputs of a collaboration will be a crucial aspect of our modeling and solutions. We believe that timing is an important and unaddressed issue in data-based collaborations. Our contributions are as follows. We formalize a model of n-party collaboration for computing functions over private inputs in which the participants receive their outputs in sequence, and the order depends on their private inputs. Each output \improves" on all previous outputs according to a reward function. We say that a mechanism for collabo-ration achieves a collaborative equilibrium if it guarantees a higher reward for all participants when joining a collaboration compared to not joining it. We show that while in general computing a collaborative equilibrium is NP-complete, we can design polynomial-Time algorithms for computing it for a range of natural model settings. When possible, we design mechanisms to compute a distribution of outputs and an ordering of output delivery, based on the n participants' private inputs, which achieves a collaborative equilibrium. The collaboration mechanisms we develop are in the standard model, and thus require a central trusted party; however, we show that this assumption is not necessary under standard cryptographic assumptions. We show how the mechanisms can be implemented in a decentralized way by n distrustful parties using new extensions of classical secure multiparty computation that impose order and timing constraints on the delivery of outputs to different players, in addition to guaranteeing privacy and correctness.
UR - http://www.scopus.com/inward/record.url?scp=84966669883&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84966669883&partnerID=8YFLogxK
U2 - 10.1145/2840728.2840758
DO - 10.1145/2840728.2840758
M3 - Conference contribution
AN - SCOPUS:84966669883
T3 - ITCS 2016 - Proceedings of the 2016 ACM Conference on Innovations in Theoretical Computer Science
SP - 213
EP - 225
BT - ITCS 2016 - Proceedings of the 2016 ACM Conference on Innovations in Theoretical Computer Science
PB - Association for Computing Machinery, Inc
T2 - 7th ACM Conference on Innovations in Theoretical Computer Science, ITCS 2016
Y2 - 14 January 2016 through 16 January 2016
ER -